[Python-Dev] Extracting python bytecode from a linux core dump?

Skip Montanaro skip.montanaro at gmail.com
Fri Jun 9 14:38:09 EDT 2017


I have a core file (produced via the gcore command) of a linux python2.6 process. I need to extract the byte code and de-compile it.

Following on Steve's comment, you might want to take a look at Misc/gdbinit for some GDB command inspiration. You are correct, you won't have a running process, but I think you should be able to source that file (maybe with tweaks, depending on the Python version you are debugging), then move up and down the C call stack, poke around in the C locals, and follow pointers to the currently active functions. For those which are Python functions, follow the func_code attribute to get the code object. I can't remember what the actual bytecode attribute is called in the code object. (It's been too many years.)
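To make the pointer chain concrete, here is a minimal sketch run in a live interpreter rather than against a core file. It only illustrates the Python-level names: a function object refers to its code object (func_code in Python 2, also reachable as __code__ from 2.6 onward and the only spelling in Python 3), and the code object keeps the raw bytecode in co_code. The function sample is a made-up example; in a core dump the same chain would have to be followed through the corresponding C struct fields by hand.

```python
import dis


def sample(a, b):
    # Hypothetical function, used only to have a code object to inspect.
    return a + b


# Python 3 spells the attribute __code__; Python 2 also offers func_code.
code_obj = getattr(sample, "__code__", None) or sample.func_code

# co_code holds the raw bytecode string for this function.
raw_bytecode = code_obj.co_code

print(code_obj.co_name, len(raw_bytecode))

# dis.dis accepts a code object and prints a readable disassembly.
dis.dis(code_obj)
```

The other co_* attributes (co_consts, co_names, co_varnames and so on) would also be needed to rebuild something a decompiler can work with, since the bytecode refers to them by index.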

However, these all seem to require either a running process and/or a binary with debugging symbols.

Yeah, you're going to have a lot of fun with a stripped executable. If you're debugging a core file from an interpreter compiled with much in the way of compiler optimization, many of the local variables will have been optimized out. You'll likely be stuck rummaging around until you figure out the pattern of where the compiler put things (register-wise).

I'm thinking that the compiled bytecode is likely in an array or contiguous set of memory within the python executable's image and that there's probably a way to pull it out with gdb. Unsurprisingly, the pyc 0xd1f20d0a magic number isn't kept in memory. So, how do I find the memory holding the compiled byte-code?

Correct. The module-level bytecode is executed once at import time, then discarded. At least, that used to be how it was done.
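As a small aside on the magic number: it belongs to the .pyc file header, not to the code object, which fits with it not showing up in memory. The sketch below rebuilds the four bytes quoted above, assuming the CPython 2.6 magic value is 62161 stored as a little-endian 16-bit integer followed by "\r\n". The helper name pyc_magic is made up for illustration.

```python
import binascii
import struct

# Assumption: CPython 2.6 writes magic value 62161, then "\r\n".
PY26_MAGIC_VALUE = 62161


def pyc_magic(value):
    """Return the 4-byte magic that starts a .pyc file for this value."""
    return struct.pack("<H", value) + b"\r\n"


magic = pyc_magic(PY26_MAGIC_VALUE)

# Show the bytes in hex so they can be compared with the value quoted above.
print(binascii.hexlify(magic))
```

In a Python 2 .pyc file these four bytes are followed by a four-byte timestamp and then the marshalled code object, so searching a core file for the magic would only find it if a file buffer happened to still be resident.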

Skip


