bpo-27987: pymalloc: align by 16bytes on 64bit platform by methane · Pull Request #12850 · python/cpython

Ok, but would it be possible to measure the "real" memory usage of a Python program?

$ ./python -c 'import django, sys; sys._debugmallocstats()'

master:

10 arenas * 262144 bytes/arena     =            2,621,440

# bytes in allocated blocks        =            2,277,200
# bytes in available blocks        =              145,016
33 unused pools * 4096 bytes       =              135,168
# bytes lost to pool headers       =               29,136
# bytes lost to quantization       =               34,920
# bytes lost to arena alignment    =                    0
Total                              =            2,621,440

align-16byte:

10 arenas * 262144 bytes/arena     =            2,621,440

# bytes in allocated blocks        =            2,370,592
# bytes in available blocks        =               87,552
25 unused pools * 4096 bytes       =              102,400
# bytes lost to pool headers       =               29,520
# bytes lost to quantization       =               31,376
# bytes lost to arena alignment    =                    0
Total                              =            2,621,440
>>> (2370592-2277200) / 2277200 * 100
4.101176883892499

About a 4% increase in bytes in allocated blocks in this case.
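The same comparison can be made for every row of the two reports, not only the allocated blocks. The following is a minimal sketch: the numbers are copied from the output above, and the dictionary keys and the `percent_change` helper are names made up for illustration.

```python
# Figures copied by hand from the two sys._debugmallocstats() reports above.
# The dictionary keys are shortened labels, not names used by CPython.
master = {
    "allocated blocks": 2_277_200,
    "available blocks": 145_016,
    "unused pools": 135_168,
    "pool headers": 29_136,
    "quantization": 34_920,
}
align16 = {
    "allocated blocks": 2_370_592,
    "available blocks": 87_552,
    "unused pools": 102_400,
    "pool headers": 29_520,
    "quantization": 31_376,
}


def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100


for name in master:
    change = percent_change(master[name], align16[name])
    print(f"{name:18} {master[name]:>10,} -> {align16[name]:>10,}  {change:+7.2f}%")

# Both builds hold the same 10 arenas, so the rows must add up to the same total.
assert sum(master.values()) == sum(align16.values()) == 10 * 262144
```

Since the arena total is identical in both builds, the extra bytes in allocated blocks come out of the available blocks and unused pools rather than from new arenas.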

$ ./python-master -c 'import django, os; os.system(f"grep VmPeak /proc/{os.getpid()}/status")'
VmPeak:    15624 kB

$ ./python-align-16byte -c 'import django, os; os.system(f"grep VmPeak /proc/{os.getpid()}/status")'
VmPeak:    15624 kB

No visible impact on VmPeak, which is consistent with both builds using 10 arenas.
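The value can also be read from inside the process without spawning `grep`. This is a sketch only, assuming a Linux `/proc` filesystem; `parse_status` and `current_process_status` are hypothetical helper names. It also reads `VmHWM` (peak resident set size), which may be closer to the "real" usage asked about than the virtual-memory peak, although it was not measured in this thread.

```python
import os


def parse_status(text, fields=("VmPeak", "VmHWM")):
    """Return {field: value in kB} for the requested fields of a
    /proc/<pid>/status dump passed in as a string."""
    result = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        if key in fields:
            # Lines look like "VmPeak:\t   15624 kB"; keep the number only.
            result[key] = int(rest.split()[0])
    return result


def current_process_status(path="/proc/self/status"):
    """Read the fields for the running process; empty dict if /proc is absent."""
    if not os.path.exists(path):
        return {}
    with open(path) as f:
        return parse_status(f.read())


if __name__ == "__main__":
    print(current_process_status())
```

To repeat the measurement above, run `import django` before calling `current_process_status()` under each build.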