msg242281
Author: Buck Evan (bukzor)
Date: 2015-04-30 18:59
In the attached example I show that there's significant memory overhead whenever a pre-compiled pyc is not present. This only occurs with more than 5225 objects (dictionaries in this case) allocated. At 13756 objects, the mysterious pyc overhead is 50% of memory usage. I've reproduced this issue in Python 2.6, 2.7, and 3.4; I imagine it's present in all CPython versions.

$ python -c 'import repro'
16736
$ python -c 'import repro'
8964
$ python -c 'import repro'
8964
$ rm *.pyc; python -c 'import repro'
16740
$ rm *.pyc; python -c 'import repro'
16736
$ rm *.pyc; python -c 'import repro'
16740
|
|
msg242282
Author: Buck Evan (bukzor)
Date: 2015-04-30 19:01
Also, we've reproduced this on both Linux and OS X.
|
|
msg242284
Author: Antoine Pitrou (pitrou)
Date: 2015-04-30 19:34
This is transitory memory consumption. Once the source is compiled to bytecode, memory consumption falls back to its previous level. Do you care that much about it?
|
|
msg242296
Author: Anthony Sottile (asottile)
Date: 2015-05-01 00:47
Adding `import gc; gc.collect()` doesn't change the outcome afaict.
|
|
msg242301
Author: Antoine Pitrou (pitrou)
Date: 2015-05-01 10:40
> Adding `import gc; gc.collect()` doesn't change the outcome afaict

Of course it doesn't. The memory has already been released. "ru_maxrss" is the maximum memory consumption during the whole process lifetime. Add the following at the end of your script (Linux):

import os, re, resource

print(resource.getrusage(resource.RUSAGE_SELF).ru_maxrss)
with open("/proc/%d/status" % os.getpid(), "r") as f:
    for line in f:
        if line.split(':')[0] in ('VmHWM', 'VmRSS'):
            print(line.strip())

And you'll see that VmRSS has already fallen back to the same level as when the pyc is not recompiled (it's a little bit more, perhaps due to fragmentation):

$ rm -r __pycache__/; ./python -c "import repro"
19244
VmHWM: 19244 kB
VmRSS: 12444 kB
$ ./python -c "import repro"
12152
VmHWM: 12152 kB
VmRSS: 12152 kB

("VmHWM" - the High Water Mark - is the same as ru_maxrss)
|
|
msg242324
Author: Anthony Sottile (asottile)
Date: 2015-05-01 14:37
I'm still seeing a very large difference:

asottile@work:/tmp$ python repro.py
ready
<module 'city_hoods' from '/tmp/city_hoods.pyc'>
72604
VmHWM: 72604 kB
VmRSS: 60900 kB
asottile@work:/tmp$ rm *.pyc; python repro.py
ready
<module 'city_hoods' from '/tmp/city_hoods.py'>
1077232
VmHWM: 1077232 kB
VmRSS: 218040 kB

This file is significantly larger than the one attached, not sure if it makes much of a difference.
|
|
msg242327
Author: Antoine Pitrou (pitrou)
Date: 2015-05-01 15:32
Which Python version is that? Can you try with 3.4 or 3.5? (Is it under GNU/Linux?)

> This file is significantly larger than the one attached, not sure
> if it makes much of a difference.

Python doesn't make a difference internally, but perhaps it has some impact on your OS' memory management.
|
|
msg242328
Author: Anthony Sottile (asottile)
Date: 2015-05-01 15:39
3.4 seems happier:

asottile@work:/tmp$ rm *.pyc; python3.4 repro.py
ready
<module 'city_hoods' from '/tmp/city_hoods.py'>
77472
VmHWM: 77472 kB
VmRSS: 65228 kB
asottile@work:/tmp$ python3.4 repro.py
ready
<module 'city_hoods' from '/tmp/city_hoods.py'>
77472
VmHWM: 77472 kB
VmRSS: 65232 kB

The nasty result above is from 2.7:

$ python
Python 2.7.6 (default, Mar 22 2014, 22:59:56)
[GCC 4.8.2] on linux2

3.3 also seems to have the same exaggerated problem:

$ rm *.pyc -f; python3.3 repro.py
ready
<module 'city_hoods' from '/tmp/city_hoods.py'>
1112996
VmHWM: 1112996 kB
VmRSS: 133468 kB
asottile@work:/tmp$ python3.3 repro.py
ready
<module 'city_hoods' from '/tmp/city_hoods.py'>
81392
VmHWM: 81392 kB
VmRSS: 69304 kB
$ python3.3
Python 3.3.6 (default, Jan 28 2015, 17:27:09)
[GCC 4.8.2] on linux

So it seems the leaky behaviour was fixed at some point. Any idea which change fixed it, and is there a possibility of backporting it to 2.7?
|
|
msg242329
Author: Antoine Pitrou (pitrou)
Date: 2015-05-01 15:40
Note that under 3.x you need to "rm -r __pycache__", not "rm *.pyc", since the pyc files are now stored in the __pycache__ subdirectory.
|
|
msg242330
Author: Anthony Sottile (asottile)
Date: 2015-05-01 15:42
Ah, then 3.4 still has the problem:

$ rm -rf __pycache__/ *.pyc; python3.4 repro.py
ready
<module 'city_hoods' from '/tmp/city_hoods.py'>
1112892
VmHWM: 1112892 kB
VmRSS: 127196 kB
asottile@work:/tmp$ python3.4 repro.py
ready
<module 'city_hoods' from '/tmp/city_hoods.py'>
77468
VmHWM: 77468 kB
VmRSS: 65228 kB
|
|
msg242331
Author: Antoine Pitrou (pitrou)
Date: 2015-05-01 15:47
Is there any chance you can upload a script that's large enough to exhibit the problem? (Perhaps with anonymized data if there's something sensitive in there.)
|
|
msg242332
Author: Anthony Sottile (asottile)
Date: 2015-05-01 15:59
Attached is repro2.py (slightly different so my editor doesn't hate itself when editing the file). I'll attach the other file in another comment, since it seems I can only do one at a time.
|
|
msg242339
Author: Antoine Pitrou (pitrou)
Date: 2015-05-01 17:31
Ok, I can reproduce:

$ rm -r __pycache__/; ./python repro2.py
ready
<module 'anon_city_hoods' from '/home/antoine/cpython/opt/anon_city_hoods.py'>
1047656
VmHWM: 1047656 kB
VmRSS: 50660 kB
$ ./python repro2.py
ready
<module 'anon_city_hoods' from '/home/antoine/cpython/opt/anon_city_hoods.py'>
77480
VmHWM: 77480 kB
VmRSS: 15664 kB

My guess is that memory fragmentation prevents the RSS mark from dropping any further, though one cannot rule out the possibility of an actual memory leak.
|
|
msg242340
Author: Antoine Pitrou (pitrou)
Date: 2015-05-01 17:32
(By the way, my numbers are with Python 3.5 - the in-development version - on 64-bit Linux.)
|
|
msg242351
Author: Buck Evan (bukzor)
Date: 2015-05-01 20:32
New data: the memory consumption seems to be in the compiler rather than the marshaller:

```
$ PYTHONDONTWRITEBYTECODE=1 python -c 'import repro'
16032
$ PYTHONDONTWRITEBYTECODE=1 python -c 'import repro'
16032
$ PYTHONDONTWRITEBYTECODE=1 python -c 'import repro'
16032
$ python -c 'import repro'
16032
$ PYTHONDONTWRITEBYTECODE=1 python -c 'import repro'
8984
$ PYTHONDONTWRITEBYTECODE=1 python -c 'import repro'
8984
$ PYTHONDONTWRITEBYTECODE=1 python -c 'import repro'
8984
```

We were trying to use PYTHONDONTWRITEBYTECODE as a workaround for this issue, but because of this it didn't help us.
|
|
msg242379
Author: Serhiy Storchaka (serhiy.storchaka)
Date: 2015-05-02 05:29
Using PYTHONDONTWRITEBYTECODE is not a workaround, because it makes you pay the memory overhead unconditionally. The compiler needs more memory than the compiled data itself requires. If this is an issue, I suggest using a different representation for the data: JSON, pickle, or just marshal. It may also be faster. You could also try CSV or a simple custom format if that is appropriate.
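For illustration, a minimal sketch of that suggestion (the module name, file name, and `data` variable here are hypothetical, not taken from the attached files):

```
import json

# One-off conversion: import the existing module once and dump its data
# to a plain JSON file instead of keeping it as a huge Python literal.
# from city_hoods import data
# with open("city_hoods_data.json", "w") as f:
#     json.dump(data, f)

# At runtime, loading JSON involves neither the bytecode compiler nor .pyc files:
with open("city_hoods_data.json") as f:
    data = json.load(f)
```

marshal.load() would work the same way and is usually faster still, at the cost of a CPython-specific, version-dependent format.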
|
|
msg242583
Author: Buck Evan (bukzor)
Date: 2015-05-04 21:31
@serhiy.storchaka This is a very stable piece of a legacy code base, so we're not keen to refactor it so dramatically, although we could. We've worked around this issue by compiling pyc files ahead of time and taking extra care that they're preserved through deployment. This isn't blocking our 2.7 transition anymore.
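For reference, a sketch of that kind of ahead-of-time compilation step (the directory path is hypothetical; this is the stdlib compileall module, which can also be invoked as `python -m compileall`):

```
import compileall

# Compile every .py file under the deployment tree so that importing at
# runtime never has to invoke the bytecode compiler.
compileall.compile_dir("path/to/deployed/code", quiet=1)
```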
|
|
msg318505
Author: Jonathan G. Underwood (jonathan.underwood)
Date: 2018-06-02 16:37
Seeing a very similar problem - very high memory usage during byte compilation. Consider the very simple code in a file:

```
def test_huge():
    try:
        huge = b'\0' * 0x100000000  # this allocates 4GB of memory!
    except MemoryError:
        print('OOM')
```

Running this sequence of commands shows that during byte compilation, 4 GB of memory is used. Presumably this is because of the `huge` object - note of course that the function isn't actually executed.

```
valgrind --tool=massif python memdemo.py
ms_print massif.out.7591 | less
```

You'll need to replace 7591 with whatever process number valgrind reports. Is there any hope of fixing this? It's currently a problem for me when running tests on Travis, where the memory limit is 3GB. I had hoped to use a conditional like the above to skip tests that would require more memory than is available. However, the testing is killed before that, simply because the byte compilation is causing an OOM.
|
|
msg318507
Author: R. David Murray (r.david.murray)
Date: 2018-06-02 18:29
That's presumably due to the compile-time constant-expression optimization. Have you tried bytes(0x1000000)? I don't think that gets treated as a constant by the optimizer (but I could be wrong, since a bunch of things have been added to it lately).
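For illustration, one way to check what the optimizer folds is to disassemble a compiled snippet (a sketch; the tiny sizes are only to keep the output readable):

```
import dis

# b'\0' * 4 is folded by the optimizer, so the resulting bytes object
# appears directly in co_consts at compile time.
dis.dis(compile("x = b'\\0' * 4", "<demo>", "exec"))

# bytes(4) is an ordinary call evaluated at run time; only the integer 4
# is stored as a constant.
dis.dis(compile("x = bytes(4)", "<demo>", "exec"))
```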
|
|
msg318508
Author: Serhiy Storchaka (serhiy.storchaka)
Date: 2018-06-02 18:31
Jonathan, this is a different problem, and it is fixed in 3.6+ (see issue 21074).
|
|
msg318509
Author: Jonathan G. Underwood (jonathan.underwood)
Date: 2018-06-02 18:45
Thanks to both Serhiy Storchaka and David Murray - indeed you're both correct, that is the issue described in 21074, and the workaround from there of declaring a variable for the size fixes the problem.
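For completeness, a sketch of that workaround applied to the example above (the optimizer only folds expressions whose operands are literal constants, so routing the size through a variable keeps the 4 GB bytes object out of the compiled constants):

```
def test_huge():
    size = 0x100000000  # a name, not a literal, so the product isn't folded
    try:
        huge = b'\0' * size  # allocated only if and when the function runs
    except MemoryError:
        print('OOM')
```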
|
|
msg320980
Author: Inada Naoki (methane)
Date: 2018-07-03 13:12
In the repro2 case, the unreturned memory is held by glibc malloc; jemalloc mitigates this issue. There is some fragmentation in pymalloc, but I think it's at an acceptable level.

$ python3 -B repro2.py
ready
<module 'anon_city_hoods' from '/home/inada-n/anon_city_hoods.py'>
1079124
VmHWM: 1079124 kB
VmRSS: 83588 kB
$ LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.1 python3 -B repro2.py
ready
<module 'anon_city_hoods' from '/home/inada-n/anon_city_hoods.py'>
1108424
VmHWM: 1108424 kB
VmRSS: 28140 kB
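As a side note (not part of the original report), glibc's retention of freed heap memory can be probed from Python by calling malloc_trim via ctypes; this is only a sketch, Linux/glibc specific, and it is not guaranteed to release anything:

```
import ctypes
import ctypes.util

# Ask glibc to give free heap pages back to the kernel.  Returns 1 if any
# memory was released, 0 otherwise (or on allocators without malloc_trim).
libc = ctypes.CDLL(ctypes.util.find_library("c"))
released = libc.malloc_trim(0)
print("malloc_trim released memory:", bool(released))
```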
|
|
msg320981
Author: Inada Naoki (methane)
Date: 2018-07-03 13:26
Since anon_city_hoods has massive constants, compiler_add_const makes its dict larger and larger, and it creates many large tuples too. I suspect that makes glibc malloc unhappy. Maybe we can improve pymalloc for medium and large objects by porting the strategy from jemalloc. That could be a good GSoC project. But I suggest closing this issue as "won't fix" for now.
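To get a rough feel for how many constants the compiler ends up tracking for such a module, one can count them in the compiled code object (a sketch; anon_city_hoods.py is the attachment discussed in this thread and must be in the current directory):

```
import types

def count_consts(code):
    # Recursively count entries in co_consts, including nested code objects.
    total = len(code.co_consts)
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            total += count_consts(const)
    return total

with open("anon_city_hoods.py") as f:
    module_code = compile(f.read(), "anon_city_hoods.py", "exec")
print("constants in compiled module:", count_consts(module_code))
```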
|
|
msg320984
Author: Serhiy Storchaka (serhiy.storchaka)
Date: 2018-07-03 14:41
VmRSS for different versions:

       malloc       jemalloc
2.7:   237316 kB    90524 kB
3.4:    53888 kB    14768 kB
3.5:    51396 kB    14908 kB
3.6:    90692 kB    31776 kB
3.7:   130952 kB    28296 kB
3.8:   130284 kB    27644 kB
|
|