Issue 8482: test_gdb, gdb/libpython.py: Unable to read information on python frame


Created on 2010-04-21 10:12 by ncoghlan, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Messages (12)
msg103808 - (view) Author: Alyssa Coghlan (ncoghlan) * (Python committer) Date: 2010-04-21 10:12
Remaining failure after resolution of :
======================================================================
FAIL: test_basic_command (test.test_gdb.PyBtTests)
Verify that the "py-bt" command works
----------------------------------------------------------------------
Traceback (most recent call last):
File "/home/ncoghlan/devel/python/Lib/test/test_gdb.py", line 638, in test_basic_command
''')
File "/home/ncoghlan/devel/python/Lib/test/test_gdb.py", line 158, in assertMultilineMatches
msg='%r did not match %r' % (actual, pattern))
AssertionError: 'Breakpoint 1 at 0x453510: file Objects/object.c, line 330.\n[Thread debugging using libthread_db enabled]\n\nBreakpoint 1, PyObject_Print (op=42, fp=0x7ffff7532780, flags=1)\n at Objects/object.c:330\n330\t\treturn internal_print(op, fp, flags, 0);\n#3 Frame 0x808680, for file /home/ncoghlan/devel/python/Lib/test/gdb_sample.py, line 10, in baz (args=(1, 2, 3))\n print(42)\n#7 (unable to read python frame information)\n#10 Frame 0x81a220, for file /home/ncoghlan/devel/python/Lib/test/gdb_sample.py, line 7, in bar (a=1, b=2, c=3)\n baz(a, b, c)\n#13 Frame 0x807f00, for file /home/ncoghlan/devel/python/Lib/test/gdb_sample.py, line 4, in foo (a=1, b=2, c=3)\n bar(a, b, c)\n' did not match '^.*\n#[0-9]+ Frame 0x[0-9a-f]+, for file .*gdb_sample.py, line 7, in bar \\(a=1, b=2, c=3\\)\n baz\\(a, b, c\\)\n#[0-9]+ Frame 0x[0-9a-f]+, for file .*gdb_sample.py, line 4, in foo \\(a=1, b=2, c=3\\)\n bar\\(a, b, c\\)\n#[0-9]+ Frame 0x[0-9a-f]+, for file .*gdb_sample.py, line 12, in \\(\\)\nfoo\\(1, 2, 3\\)\n'
msg103809 - (view) Author: Alyssa Coghlan (ncoghlan) * (Python committer) Date: 2010-04-21 10:13
And Dave's comment from the other issue: Reading the frame information seems to be highly sensitive to the optimization level and the exact version of gcc for the build, and the exact version of gdb, alas. I've been tracking a failure like the one you describe, seen on 64-bit with Fedora in our downstream tracker here: https://bugzilla.redhat.com/show_bug.cgi?id=556975
msg103810 - (view) Author: Alyssa Coghlan (ncoghlan) * (Python committer) Date: 2010-04-21 10:17
Potentially relevant gcc and gdb version info:
~$ gcc --version
gcc (Ubuntu 4.4.1-4ubuntu9) 4.4.1
Copyright (C) 2009 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
~$ gdb --version
GNU gdb (GDB) 7.0-ubuntu
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
For bug reporting instructions, please see: <http://www.gnu.org/software/gdb/bugs/>.
Python build is currently just a straight unmodified call to "make". I'll see if I still get the error if I tell ./configure to set up for a debug build of Python.
msg103815 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2010-04-21 10:26
Readable version of the error message (with newlines):
---------
AssertionError: 'Breakpoint 1 at 0x453510: file Objects/object.c, line 330.
[Thread debugging using libthread_db enabled]

Breakpoint 1, PyObject_Print (op=42, fp=0x7ffff7532780, flags=1)
 at Objects/object.c:330
330 return internal_print(op, fp, flags, 0);
#3 Frame 0x808680, for file /home/ncoghlan/devel/python/Lib/test/gdb_sample.py, line 10, in baz (args=(1, 2, 3))
 print(42)
#7 (unable to read python frame information)
#10 Frame 0x81a220, for file /home/ncoghlan/devel/python/Lib/test/gdb_sample.py, line 7, in bar (a=1, b=2, c=3)
 baz(a, b, c)
#13 Frame 0x807f00, for file /home/ncoghlan/devel/python/Lib/test/gdb_sample.py, line 4, in foo (a=1, b=2, c=3)
 bar(a, b, c)
' did not match '^.*
#[0-9]+ Frame 0x[0-9a-f]+, for file .*gdb_sample.py, line 7, in bar \(a=1, b=2, c=3\)
 baz\(a, b, c\)
#[0-9]+ Frame 0x[0-9a-f]+, for file .*gdb_sample.py, line 4, in foo \(a=1, b=2, c=3\)
 bar\(a, b, c\)
#[0-9]+ Frame 0x[0-9a-f]+, for file .*gdb_sample.py, line 12, in \(\)
foo\(1, 2, 3\)
'
---------
msg103818 - (view) Author: Alyssa Coghlan (ncoghlan) * (Python committer) Date: 2010-04-21 10:59
Thanks Victor. The test actually runs fine under "./configure --with-pydebug", but reverting to a plain "./configure" and rebuilding means I get the error again. Definitely sounds like it could be due to the compiler failing to make some relevant info available to the debugger.
msg103824 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2010-04-21 12:01
> The test actually runs fine under "./configure --with-pydebug" (...)
I changed configure to disable all compiler optimisations if --with-pydebug is used. Retry with ./configure CFLAGS="-O0". A variable may be optimized (kept in a register instead of being allocated on the stack), in which case gdb is unable to get its value. If we are able to isolate the variable, you could try to ask gcc not to optimize it... but only if it doesn't hurt Python performance too much. Another simple approach is to disable the test if Python was compiled with compiler optimisations. To detect optimization, we could use a gdb batch script that tries to read the (possibly optimized) variable.
msg103840 - (view) Author: Alyssa Coghlan (ncoghlan) * (Python committer) Date: 2010-04-21 13:09
I'll wait for Dave to weigh in before I dig any further - the problem I am seeing seems to have a lot in common with the issue he reported on 64-bit Fedora, but I only followed about half of that bug discussion.
msg103934 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2010-04-22 00:55
test_gdb of py3k now passes with ./configure --with-pydebug (no optimization <=> -O0), but fails (Unable to read information on python frame) with ./configure (-O3). The test also fails with -O1. Recompiling only ceval.c with -O0 is enough to fix this issue:
----
$ rm ./Python/ceval.o
$ vim Makefile # replace -O3 by -O0
$ make && ./python Lib/test/regrtest.py -v test_gdb # success
----
msg103964 - (view) Author: Dave Malcolm (dmalcolm) (Python committer) Date: 2010-04-22 13:21
See https://bugzilla.redhat.com/show_bug.cgi?id=556975#c21 I'll try to summarize: as I understand things, the issue is that on 64-bit gcc builds with enough optimization, the argument "PyFrameObject *f" is passed as a register, and that gets clobbered at various locations within the implementation of PyEval_EvalFrameEx. The DWARF debug data contains a mapping between ranges of program counter values and methods for calculating the values of variables, and in many locations within PyEval_EvalFrameEx, it doesn't have enough information to figure out what the value of "f" is. It works on i386: the calling conventions are different, and it's fairly easy for gdb to walk up the stack and find "f". I believe my RH colleagues who hack on gcc and gdb are working on improving this so that DWARF can handle such cases (where "f" doesn't change during the function call), but Jakub describes that as "at least a year from now". In the meantime there are some options. (i) A dirty hack is that in some places in the function on x86_64 I've seen that the value of "f" is stored in the $rbp register. We could use something like this as a fallback for when directly querying gdb for "f" fails:
_type_PyFrameObjectPtr = gdb.lookup_type('PyFrameObject').pointer()
f = gdb.parse_and_eval('$rbp').cast(_type_PyFrameObjectPtr)
and then see if the value looks sane (e.g. does it have a meaningful ob_type? does 0 == strcmp($rbp->ob_type->tp_name, "frame")?); if so, then that's probably the value of "f", and use it. Obviously that would be highly fragile; it may be something I'm seeing on my particular version of gcc, but might not work for other people's gcc.
Also I suspect that it will give misleading results when traversing the stack of C-level frames. Another approach would be to simply accept that it isn't going to work on x86_64 with gcc with optimizations turned on until a later version of gcc. We could handle this: (ii) by trying to detect this situation during test_gdb.py, and to skip the affected tests. (iii) by generalizing the tests so that they can handle the case where arbitrary frames aren't readable. Approach (iii) seems most promising to me, and I'll try to cook up a patch (though I want to try approach (i), as it would be nice to get visibility into these frames)
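A rough illustration of what approach (iii) could look like, assuming the only symptom is the "(unable to read python frame information)" line seen in the failing output above. The helper names `readable_frames` and `frames_match` are hypothetical and are not taken from test_gdb.py; this is a sketch of the idea, not the patch itself.

```python
import re

# Marker printed by py-bt for a frame it cannot decode (copied from the
# failing output quoted earlier in this issue).
UNREADABLE = "(unable to read python frame information)"


def readable_frames(bt_output):
    """Return py-bt output with the unreadable-frame lines dropped."""
    kept = [line for line in bt_output.splitlines()
            if UNREADABLE not in line]
    return "\n".join(kept) + "\n"


def frames_match(bt_output, pattern):
    """Match a multi-line pattern against the readable frames only."""
    filtered = readable_frames(bt_output)
    return re.match(pattern, filtered, re.DOTALL) is not None
```

With this, a test would still assert on the frames gdb can decode, but would no longer fail just because an arbitrary frame in between was unreadable. It does not help when the frame the test expects is itself the unreadable one.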
msg103968 - (view) Author: Alyssa Coghlan (ncoghlan) * (Python committer) Date: 2010-04-22 13:46
Thanks, I understood your summary significantly better than I did the bug discussion :) Any of your 3 options sounds reasonable to me (although i. sounds potentially fragile in the face of different versions of gcc, so iii. might be necessary anyway). The quick and dirty approach would be a variant of ii. that just skipped the offending test for GCC on x86_64 machines. The platform library would make that fairly straightforward:
>>> platform.python_compiler()
'GCC 4.4.1'
>>> platform.machine()
'x86_64'
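A minimal sketch of that quick and dirty skip, built on the two platform calls above. The function name `gcc_on_x86_64` and the decorator name `skip_on_gcc_x86_64` are made up for illustration, and the check is deliberately coarse: it looks only at compiler and architecture, not at the optimization level.

```python
import platform
import unittest


def gcc_on_x86_64(compiler=None, machine=None):
    """Return True if the interpreter was built with GCC for x86_64.

    Both arguments default to the values reported by the platform module;
    they are parameters only so the check can be exercised with other values.
    """
    if compiler is None:
        compiler = platform.python_compiler()
    if machine is None:
        machine = platform.machine()
    return compiler.startswith("GCC") and machine == "x86_64"


# Hypothetical decorator for the affected test methods.
skip_on_gcc_x86_64 = unittest.skipIf(
    gcc_on_x86_64(),
    "gdb may be unable to read Python frames on GCC x86_64 builds")
```

Applied as `@skip_on_gcc_x86_64` on a test method, this would also skip unoptimized GCC x86_64 builds where the test is reported to pass, which is the price of keeping the check this simple.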
msg103969 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2010-04-22 13:57
> the issue is that on 64-bit gcc builds with enough optimization,
> the argument "PyFrameObject *f" is passed as a register,
> and that gets clobbered at various locations within the
> implementation of PyEval_EvalFrameEx.
test_gdb fails (Unable to read information on python frame) on my i386 computer (32 bits) with -O1 (but it doesn't with -O0). I'm using Debian Sid: gcc 4.4.3 and gdb 7.1.
msg109568 - (view) Author: Dave Malcolm (dmalcolm) (Python committer) Date: 2010-07-08 19:01
> test_gdb fails (Unable to read information on python frame) on my i386
> computer (32 bits) with -O1 (but it doesn't with -O0). I'm using Debian
> Sid: gcc 4.4.3 and gdb 7.1.
This should be fixed now that issue 8605 is resolved: we now skip test_gdb if the compiler optimization level is above -O0.
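For readers wondering how "optimization level above -O0" can be detected from Python, here is one possible sketch. It assumes the build flags are available through `sysconfig.get_config_var("PY_CFLAGS")` and that the last `-O` option on the command line is the one that takes effect. The function names are hypothetical, and this is not claimed to be the exact check added for issue 8605.

```python
import sysconfig


def optimization_flag(cflags):
    """Return the last -O option in a CFLAGS string, or '' if there is none."""
    last = ""
    for option in cflags.split():
        if option.startswith("-O"):
            last = option  # a later -O option overrides an earlier one
    return last


def built_with_optimizations(cflags=None):
    """Return True if the flags select an optimization level other than -O0."""
    if cflags is None:
        # PY_CFLAGS may be missing on some platforms; treat that as no flags.
        cflags = sysconfig.get_config_var("PY_CFLAGS") or ""
    return optimization_flag(cflags) not in ("", "-O0")
```

A test module could then skip itself when `built_with_optimizations()` is true, which covers both the x86_64 failure and the i386 failure at -O1 reported above.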
History
Date User Action Args
2022-04-11 14:57:00 admin set github: 52728
2010-07-08 19:04:12 dmalcolm set status: open -> closed
2010-07-08 19:01:32 dmalcolm set resolution: fixed messages: + stage: needs patch -> resolved
2010-06-06 01:39:21 ezio.melotti set nosy: + ezio.melotti
2010-04-22 13:57:29 vstinner set messages: + title: test_gdb - "(unable to read python frame information)" mismatch -> test_gdb, gdb/libpython.py: Unable to read information on python frame
2010-04-22 13:46:25 ncoghlan set messages: +
2010-04-22 13:25:10 dmalcolm link issue8494 superseder
2010-04-22 13:21:48 dmalcolm set messages: +
2010-04-22 00:55:34 vstinner set messages: +
2010-04-21 18:36:04 loewis set assignee: dmalcolm
2010-04-21 13:09:31 ncoghlan set messages: +
2010-04-21 13:09:22 ncoghlan set messages: -
2010-04-21 13:09:09 ncoghlan set messages: +
2010-04-21 12:01:54 vstinner set messages: +
2010-04-21 10:59:37 ncoghlan set messages: +
2010-04-21 10:26:56 vstinner set messages: +
2010-04-21 10:17:08 ncoghlan set messages: +
2010-04-21 10:13:19 ncoghlan set messages: +
2010-04-21 10:12:47 ncoghlan create