[Python-Dev] Basic pymalloc stats
Tim Peters tim.one@comcast.net
Fri, 05 Apr 2002 01:17:13 -0500
FYI, I implemented the optimizations Vladimir and I discussed here.
Next, _PyMalloc_DebugDumpStats() is an entry point you can call in a debug build (or in a release build with PYMALLOC_DEBUG enabled) to get a snapshot of pymalloc's internal structures. Perhaps it should also be available in a release build without PYMALLOC_DEBUG. As is, because PYMALLOC_DEBUG is enabled, every allocation is bumped by 16 bytes to make room for PYMALLOC_DEBUG's memory decorations, so the snapshot reflects padded request sizes rather than what a plain release build would see.
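To make the effect of that 16-byte bump concrete, here is a rough Python model of how a request size maps to a size class. It is only a sketch: the 8-byte step and the 256-byte threshold come from the output in this message, while the names (size_class, block_size, DEBUG_PAD) and the formula (nbytes - 1) // 8 are my own illustrative assumptions, not pymalloc's actual code.

```python
# Hypothetical model of pymalloc size classes, not the real implementation.
ALIGNMENT = 8                    # block sizes step by 8 bytes
SMALL_REQUEST_THRESHOLD = 256    # "Small block threshold = 256"
DEBUG_PAD = 16                   # bytes PYMALLOC_DEBUG adds to each request


def size_class(nbytes):
    """Return the assumed size class index, or None if pymalloc would
    not serve a request of this size from its pools."""
    if not 1 <= nbytes <= SMALL_REQUEST_THRESHOLD:
        return None
    return (nbytes - 1) // ALIGNMENT


def block_size(cls):
    """Block size in bytes for a size class index."""
    return (cls + 1) * ALIGNMENT


if __name__ == "__main__":
    for request in (1, 32, 40, 241):
        plain = size_class(request)
        padded = size_class(request + DEBUG_PAD)
        padded_desc = (
            "beyond the small block threshold"
            if padded is None
            else "class %d (%d bytes)" % (padded, block_size(padded))
        )
        print(
            "request %3d: class %d (%d bytes) without padding, %s with padding"
            % (request, plain, block_size(plain), padded_desc)
        )
```

Under this model a 32-byte request lands in the 48-byte class once padded, and a request just under the threshold is pushed past it entirely.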
Here's sample output (recently greatly improved), from near the tail end of a debug-build run of the test suite:
Small block threshold = 256, in 32 size classes.
pymalloc malloc+realloc called 4414692 times.
class   num bytes   num pools   blocks in use   avail blocks
    5          48         773           64932              0
    6          56         266           19028            124
    7          64         288           18122             22
    8          72         124            6914             30
    9          80         178            8873             27
   10          88          41            1867             19
   11          96          28            1170              6
   12         104          21             798             21
   13         112          16             543             33
   14         120          11             359              4
   15         128           8             228             20
   16         136           5             141              4
   17         144           5             114             26
   18         152          13             295             43
   19         160           6             144              6
   20         168         138            3292             20
   21         176           5              96             19
   22         184           4              76             12
   23         192           3              43             20
   24         200           3              42             18
   25         208           3              40             17
   26         216           3              43             11
   27         224           2              29              7
   28         232           3              32             19
   29         240           2              21             11
   30         248           2              31              1
   31         256           2              21              9
31 arenas * 262144 bytes/arena = 8126464
0 unused pools * 4096 bytes = 0
bytes in allocated blocks = 7796144
bytes in available blocks = 69056
bytes lost to pool headers = 62496
bytes lost to quantization = 71792
bytes lost to arena alignment = 126976
Total = 8126464
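For anyone who wants to check the bookkeeping, the summary lines can be rebuilt from the per-class table. The sketch below is not part of pymalloc. The 32-byte pool header is an assumption inferred from the figures themselves (62496 header bytes over 1953 pools), and arena alignment is computed simply as whatever part of the arenas is not covered by pools.

```python
# Recompute the summary lines from the per-class table in this message.
POOL_SIZE = 4096
ARENA_SIZE = 262144
NUM_ARENAS = 31
POOL_HEADER = 32   # assumption: inferred as 62496 / 1953, not stated in the post

# (block size in bytes, pools, blocks in use, available blocks)
ROWS = [
    (48, 773, 64932, 0), (56, 266, 19028, 124), (64, 288, 18122, 22),
    (72, 124, 6914, 30), (80, 178, 8873, 27), (88, 41, 1867, 19),
    (96, 28, 1170, 6), (104, 21, 798, 21), (112, 16, 543, 33),
    (120, 11, 359, 4), (128, 8, 228, 20), (136, 5, 141, 4),
    (144, 5, 114, 26), (152, 13, 295, 43), (160, 6, 144, 6),
    (168, 138, 3292, 20), (176, 5, 96, 19), (184, 4, 76, 12),
    (192, 3, 43, 20), (200, 3, 42, 18), (208, 3, 40, 17),
    (216, 3, 43, 11), (224, 2, 29, 7), (232, 3, 32, 19),
    (240, 2, 21, 11), (248, 2, 31, 1), (256, 2, 21, 9),
]

usable = POOL_SIZE - POOL_HEADER   # bytes in a pool available for blocks

pools = sum(p for _, p, _, _ in ROWS)
allocated = sum(size * used for size, _, used, _ in ROWS)
available = sum(size * avail for size, _, _, avail in ROWS)
headers = pools * POOL_HEADER
# Leftover bytes at the end of each pool that cannot hold a whole block.
quantization = sum(p * (usable % size) for size, p, _, _ in ROWS)
# Arena bytes not covered by any pool (there are 0 unused pools here).
alignment = NUM_ARENAS * ARENA_SIZE - pools * POOL_SIZE

total = allocated + available + headers + quantization + alignment

if __name__ == "__main__":
    for label, value in [
        ("bytes in allocated blocks", allocated),
        ("bytes in available blocks", available),
        ("bytes lost to pool headers", headers),
        ("bytes lost to quantization", quantization),
        ("bytes lost to arena alignment", alignment),
        ("Total", total),
    ]:
        print("%-30s = %d" % (label, value))
```

With the assumed header size, every recomputed line agrees with the dump, and the alignment loss works out to exactly one 4096-byte pool per arena.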
Running the Unicode tests vastly increases the number of the smallest blocks in use. The hump in the 168-byte class is due to small dicts.
Feel lightly encouraged to try calling this in your real programs now, and strongly encouraged after the memory-API rework is complete.
Try very hard not to read too much into the test suite. All I take from the above is that memory utilization is excellent and fragmentation is trivial. For example, in the 56-byte class, 124 available blocks * 56 bytes/block is greater than a 4096-byte pool, so in an ideal world we could get away with 265 pools of this size instead of 266. The wastage due to tossing away "the ends" of arenas to leave pool-aligned pools ("arena alignment") is significant compared to the other kinds of pure waste in pymalloc ("quantization" means bytes lost because the usable space in a pool often isn't an exact multiple of the pool's block size), but overall wastage is low. Note that there's no accounting here for what's lost due to returning 8-byte aligned addresses.
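The 56-byte example and the overall utilization claim can be spelled out as arithmetic. This is a small sketch using only numbers from the dump, plus an assumed 32-byte pool header (inferred from the reported header total, not stated outright). The variable names are made up for illustration.

```python
import math

POOL_SIZE = 4096
POOL_HEADER = 32   # assumption: inferred from the reported header total

# The 56-byte size class from the table.
size, pools, in_use, avail = 56, 266, 19028, 124

blocks_per_pool = (POOL_SIZE - POOL_HEADER) // size
free_bytes = avail * size                       # space held by free blocks
ideal_pools = math.ceil(in_use / blocks_per_pool)

# Overall picture from the summary lines.
total = 8126464
allocated = 7796144
waste = {
    "pool headers": 62496,
    "quantization": 71792,
    "arena alignment": 126976,
}
utilization = allocated / total

if __name__ == "__main__":
    print("blocks per 56-byte pool:", blocks_per_pool)
    print("free bytes in class:", free_bytes, "vs pool size", POOL_SIZE)
    print("pools in use:", pools, "ideal:", ideal_pools)
    print("utilization: %.1f%%" % (100 * utilization))
    for name, lost in sorted(waste.items(), key=lambda kv: -kv[1]):
        print("  lost to %-16s %7d (%.2f%%)" % (name, lost, 100 * lost / total))
```

The free blocks in that class add up to more than one pool's worth of bytes, and packing the in-use blocks perfectly would need one pool fewer than the 266 actually held.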