Issue 19533: Unloading docstrings from memory if -OO is given

Created on 2013-11-09 06:54 by deleted250130, last changed 2022-04-11 14:57 by admin.

Files
File name: test.py
Uploaded: deleted250130, 2013-11-09 06:54
Messages (8)
msg202465 - Author: (deleted250130) Date: 2013-11-09 06:54
Using -OO on a script removes the __doc__ attributes, but the docstrings still remain in process memory. Attached is an example script that demonstrates this with a docstring of ~10 MiB (opening the file in an editor can take some time). Calling "python3 -OO test.py" results in a memory usage of ~16 MiB on my system (Linux 64-bit) while test.__doc__ is None.
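The attached test.py is not reproduced here, so the following is only a minimal sketch of the behaviour described in the report. It assumes a hypothetical generated function `f` with a large docstring (1 MiB rather than ~10 MiB, to keep it quick) and uses the built-in `compile()` with its `optimize` argument in place of the -OO command-line flag. The helper names `build_source` and `docstring_after_compile` are illustrative and do not come from the issue. It only shows that `__doc__` disappears at optimization level 2; it does not measure process memory.

```python
def build_source(doc_size):
    """Return source for a function f whose docstring is doc_size characters long."""
    filler = "x" * doc_size
    return 'def f():\n    """' + filler + '"""\n    pass\n'


def docstring_after_compile(source, optimize):
    """Compile source at the given optimization level and return f.__doc__."""
    # optimize=2 corresponds to running the interpreter with -OO.
    code = compile(source, "<demo>", "exec", optimize=optimize)
    namespace = {}
    exec(code, namespace)
    return namespace["f"].__doc__


if __name__ == "__main__":
    source = build_source(1024 * 1024)  # hypothetical size, smaller than the report's
    for level in (0, 2):
        doc = docstring_after_compile(source, level)
        print(level, None if doc is None else len(doc))
```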
msg202485 - Author: Brett Cannon (brett.cannon) (Python committer) Date: 2013-11-09 16:06
Do realize this is a one-time memory cost, though, because next execution will load from the .pyo and thus will never load the docstring into memory. If you pre-compile all bytecode with -OO this will never even occur.
msg202486 - Author: (deleted250130) Date: 2013-11-09 16:24
> Do realize this is a one-time memory cost, though, because next execution will load from the .pyo and thus will never load the docstring into memory.
Except in 2 cases:
- The bytecode was previously generated with -O.
- The bytecode couldn't be written (for example permission issues or Python was invoked with -B).
msg202491 - Author: R. David Murray (r.david.murray) (Python committer) Date: 2013-11-09 18:35
So the question is, if there is no longer a reference to the docstring, why isn't it garbage collected? (I tested adding a gc.collect(), and it didn't make any difference.)
msg202500 - Author: Stefan Krah (skrah) (Python committer) Date: 2013-11-09 20:43
R. David Murray <report@bugs.python.org> wrote:
> So the question is, if there is no longer a reference to the docstring, why isn't it garbage collected? (I tested adding a gc.collect(), and it didn't make any difference.)
I think it probably is garbage collected but the freed memory is not returned to the OS by the memory allocator.
msg202501 - Author: R. David Murray (r.david.murray) (Python committer) Date: 2013-11-09 21:12
Hmm. If I turn on gc debugging before the def, I don't see anything get collected. If I allocate a series of new 10K strings, the memory keeps growing. Of course, that could still be down to the vagaries of OS memory management. Time to break out Victor's tracemalloc, but I probably don't have that much ambition today :)
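Nobody in the thread posted a tracemalloc run, so the following is only a rough sketch of what such a measurement could look like. It assumes a hypothetical generated source string and the illustrative helper `traced_compile`, neither of which appears in the issue. Note that tracemalloc reports memory requested through Python's allocators, not the resident size of the process, so it cannot by itself confirm or rule out the theory that freed memory is simply not returned to the OS.

```python
import tracemalloc


def traced_compile(source, optimize):
    """Compile source while tracing; return (code, current_bytes, peak_bytes)."""
    tracemalloc.start()
    try:
        code = compile(source, "<demo>", "exec", optimize=optimize)
        # current: bytes still allocated now; peak: high-water mark since start().
        current, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return code, current, peak


if __name__ == "__main__":
    doc_size = 1024 * 1024  # hypothetical docstring size
    source = 'def f():\n    """' + "x" * doc_size + '"""\n    pass\n'
    for level in (0, 2):
        code, current, peak = traced_compile(source, level)
        print(f"optimize={level}: current={current} peak={peak}")
```

Comparing the current and peak figures for the two levels gives a first hint of how much of the docstring's memory is still held by Python objects after compilation, as opposed to memory that was only needed transiently while parsing.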
msg202526 - Author: Stefan Krah (skrah) (Python committer) Date: 2013-11-10 14:19
It looks like the memory management is based directly on Py_Arenas:

    def f():
        """squeamish ossifrage"""
        pass

    Breakpoint 1, PyArena_Free (arena=0x9a5120) at Python/pyarena.c:159
    159         assert(arena);
    (gdb) p arena->a_objects
    $1 = ['f', 'squeamish ossifrage']
    (gdb) bt
    #0  PyArena_Free (arena=0x9a5120) at Python/pyarena.c:159
    #1  0x0000000000425af5 in PyRun_FileExFlags (fp=0xa1b780, filename_str=0x7ffff7f37eb0 "docstr.py", start=257, globals=
        {'f': <function at remote 0x7ffff7f04058>, '__builtins__': <module at remote 0x7ffff7f6a358>, '__name__': '__main__', '__file__': 'docstr.py', '__package__': None, '__loader__': <SourceFileLoader(name='__main__', path='docstr.py') at remote 0x7ffff7ede608>, '__cached__': None, '__doc__': None}, locals=
        {'f': <function at remote 0x7ffff7f04058>, '__builtins__': <module at remote 0x7ffff7f6a358>, '__name__': '__main__', '__file__': 'docstr.py', '__package__': None, '__loader__': <SourceFileLoader(name='__main__', path='docstr.py') at remote 0x7ffff7ede608>, '__cached__': None, '__doc__': None}, closeit=1, flags=0x7fffffffe490) at Python/pythonrun.c:2114
    #2  0x0000000000423a0c in PyRun_SimpleFileExFlags (fp=0xa1b780, filename=0x7ffff7f37eb0 "docstr.py", closeit=1, flags=0x7fffffffe490) at Python/pythonrun.c:1589
    #3  0x000000000042289c in PyRun_AnyFileExFlags (fp=0xa1b780, filename=0x7ffff7f37eb0 "docstr.py", closeit=1, flags=0x7fffffffe490) at Python/pythonrun.c:1276
    #4  0x000000000043bc83 in run_file (fp=0xa1b780, filename=0x9669b0 L"docstr.py", p_cf=0x7fffffffe490) at Modules/main.c:336
    #5  0x000000000043c8c5 in Py_Main (argc=3, argv=0x964020) at Modules/main.c:780
    #6  0x000000000041cdb5 in main (argc=3, argv=0x7fffffffe688) at ./Modules/python.c:69

So the string 'squeamish ossifrage' is still in arena->a_objects right until the end of PyRun_FileExFlags(), even with -OO.
msg363550 - Author: Brett Cannon (brett.cannon) (Python committer) Date: 2020-03-06 20:43
Do note that .pyc files now encode their optimization levels, so the only thing to potentially do here is change the compiler to toss docstrings out and make sure they are freed when they are parsed to avoid holding on to them.
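As an illustration of the point about .pyc files encoding their optimization level, the sketch below uses `importlib.util.cache_from_source()` to show the cache file name chosen for each level. The helper name `cache_paths` and the file name passed to it are only illustrative, and the exact path depends on the interpreter's cache tag.

```python
import importlib.util


def cache_paths(source_path):
    """Map each optimization level to the bytecode cache path CPython would use."""
    # "" means no optimization tag; 1 and 2 correspond to -O and -OO.
    return {
        level: importlib.util.cache_from_source(source_path, optimization=level)
        for level in ("", 1, 2)
    }


if __name__ == "__main__":
    for level, path in cache_paths("test.py").items():
        print(repr(level), path)
```

Because each level gets its own file, bytecode written under -O is no longer picked up by a later -OO run, which addresses the first exception raised earlier in the thread.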
History
Date User Action Args
2022-04-11 14:57:53 admin set github: 63732
2020-03-06 20:43:28 brett.cannon set nosy: - brett.cannon
2020-03-06 20:43:24 brett.cannon set messages: +
2014-05-22 22:09:30 skrah set nosy: - skrah
2014-04-24 05:52:19 pconnell set nosy: + pconnell
2013-11-10 14:19:40 skrah set messages: +
2013-11-09 21:12:13 r.david.murray set messages: +
2013-11-09 20:43:36 skrah set messages: +
2013-11-09 18:35:28 r.david.murray set nosy: + r.david.murray; messages: +
2013-11-09 16:24:18 deleted250130 set messages: +
2013-11-09 16:06:12 brett.cannon set nosy: + brett.cannon; messages: +
2013-11-09 09:09:31 serhiy.storchaka set versions: + Python 3.3, - Python 2.7
2013-11-09 09:08:48 serhiy.storchaka set messages: -
2013-11-09 09:08:37 serhiy.storchaka set type: behavior -> enhancement; components: - Tests; versions: - Python 3.3, Python 3.4
2013-11-09 09:05:08 serhiy.storchaka set versions: + Python 2.7, Python 3.4; nosy: + skrah, serhiy.storchaka; messages: +; components: + Tests; type: enhancement -> behavior
2013-11-09 06:54:19 deleted250130 create