bpo-27987: pymalloc: align by 16bytes on 64bit platform by methane · Pull Request #12850 · python/cpython
Ok, but would it be possible to measure the "real" memory usage of a Python program?
$ ./python -c 'import django, sys; sys._debugmallocstats()'
master:
10 arenas * 262144 bytes/arena = 2,621,440
# bytes in allocated blocks = 2,277,200
# bytes in available blocks = 145,016
33 unused pools * 4096 bytes = 135,168
# bytes lost to pool headers = 29,136
# bytes lost to quantization = 34,920
# bytes lost to arena alignment = 0
Total = 2,621,440
align-16byte:
10 arenas * 262144 bytes/arena = 2,621,440
# bytes in allocated blocks = 2,370,592
# bytes in available blocks = 87,552
25 unused pools * 4096 bytes = 102,400
# bytes lost to pool headers = 29,520
# bytes lost to quantization = 31,376
# bytes lost to arena alignment = 0
Total = 2,621,440
>>> (2370592-2277200) / 2277200 * 100
4.101176883892499
About a 4% increase in bytes in allocated blocks in this case.
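The same comparison can be run over every row of the two dumps, not just the allocated blocks. This is a minimal sketch with the figures copied from the output above; the dictionary keys and the `percent_change` helper are illustrative names, not part of CPython. The "arena alignment" row is omitted because it is 0 in both builds.

```python
# Figures copied from the two sys._debugmallocstats() dumps above.
master = {
    "allocated blocks": 2_277_200,
    "available blocks": 145_016,
    "unused pools": 135_168,
    "pool headers": 29_136,
    "quantization": 34_920,
}
align16 = {
    "allocated blocks": 2_370_592,
    "available blocks": 87_552,
    "unused pools": 102_400,
    "pool headers": 29_520,
    "quantization": 31_376,
}
ARENA_TOTAL = 10 * 262_144  # 10 arenas in both builds


def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100


# Sanity check: the rows of each dump add up to the arena total.
assert sum(master.values()) == ARENA_TOTAL
assert sum(align16.values()) == ARENA_TOTAL

for key in master:
    delta = align16[key] - master[key]
    change = percent_change(master[key], align16[key])
    print(f"{key:<18} {delta:>+8,} bytes ({change:+.2f}%)")
```

Both builds use the same 10 arenas, so in this run the extra bytes in allocated blocks are taken from available blocks and unused pools rather than from newly mapped arenas.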
$ ./python-master -c 'import django, os; os.system(f"grep VmPeak /proc/{os.getpid()}/status")'
VmPeak: 15624 kB
$ ./python-align-16byte -c 'import django, os; os.system(f"grep VmPeak /proc/{os.getpid()}/status")'
VmPeak: 15624 kB
No visible impact on VmPeak.
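The same check can be done from Python without shelling out to `grep`. This is a sketch that assumes Linux and the `/proc/<pid>/status` format shown above; `parse_status_kb` and `read_status_kb` are hypothetical helper names. `VmPeak` is the peak virtual memory size, so the sketch also prints `VmHWM` (peak resident set size), which may be closer to the "real" usage asked about.

```python
import os


def parse_status_kb(text, field):
    """Return the kB value of `field` in /proc/<pid>/status text, or None."""
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if sep and name == field:
            # A matching line looks like "VmPeak:    15624 kB".
            return int(rest.split()[0])
    return None


def read_status_kb(field, pid="self"):
    """Read one field from /proc/<pid>/status (Linux only)."""
    with open(f"/proc/{pid}/status") as f:
        return parse_status_kb(f.read(), field)


if __name__ == "__main__":
    # Import the workload to measure (e.g. django) before this point.
    if os.path.exists("/proc/self/status"):
        for field in ("VmPeak", "VmHWM"):
            print(field, read_status_kb(field), "kB")
```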