[Python-Dev] [ python-Patches-876206 ] scary frame speed hacks

Tim Peters tim.one at comcast.net
Tue Mar 2 22:45:46 EST 2004


[Neal Norwitz]

> In Include/frameobject.h, b_type and b_level can be combined into a single 32-bit value, rather than kept as two separate fields in PyTryBlock. There is a bit more processing to pull the values apart. IIRC, there was a very small, but measurable performance hit. You can also decrease CO_MAXBLOCKS. I was able to drop the size to under 256 bytes. But perf was still a bit off.
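A minimal sketch of the packing idea described above, not the patch itself. The 8/24 bit split and the names block_pack, block_type and block_level are assumptions made for illustration; the shift and mask in the accessors are the "bit more processing" needed to pull the values apart.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: the low 8 bits hold the block type (a small integer
   identifying the kind of block), the upper 24 bits hold the stack level. */
#define BLOCK_TYPE_BITS 8u
#define BLOCK_TYPE_MASK ((1u << BLOCK_TYPE_BITS) - 1u)
#define BLOCK_LEVEL_MAX ((1u << (32u - BLOCK_TYPE_BITS)) - 1u)

/* Combine type and level into one 32-bit word. */
static uint32_t
block_pack(unsigned type, unsigned level)
{
    assert(type <= BLOCK_TYPE_MASK);
    assert(level <= BLOCK_LEVEL_MAX);
    return ((uint32_t)level << BLOCK_TYPE_BITS) | (uint32_t)type;
}

/* Extract the type: one mask operation. */
static unsigned
block_type(uint32_t packed)
{
    return (unsigned)(packed & BLOCK_TYPE_MASK);
}

/* Extract the level: one shift operation. */
static unsigned
block_level(uint32_t packed)
{
    return (unsigned)(packed >> BLOCK_TYPE_BITS);
}
```

Every read of either field now costs an extra mask or shift, which is one plausible source of the small slowdown mentioned above.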

PyTryBlock is indeed the frameobject memory pig, but it shouldn't need to be -- the size needed varies by code object, but is fixed per code object, and is usually very small (IIRC, it's the max nesting depth in the body of the code, counting only nested loops and "try" structures (not counting nested def, class, or "if" structures)). So it should be possible, e.g., to shove it off to the variable-size tail of a frame, and allocate only exactly as much as a given code object needs. For that matter, we wouldn't need an arbitrary upper bound (currently 20) on nesting depth then either.
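The variable-size-tail idea can be sketched as follows. This is an illustration under stated assumptions, not CPython code: SketchFrame, max_blocks and the sketch_frame_* helpers are hypothetical names, TryBlock only mirrors the three int fields of PyTryBlock, and a real frame carries many more fields than shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Mirrors the three int fields of PyTryBlock. */
typedef struct {
    int b_type;
    int b_handler;
    int b_level;
} TryBlock;

/* Hypothetical frame: the block stack lives in the variable-size tail
   instead of a fixed array of 20 entries. */
typedef struct {
    int max_blocks;     /* assumed to be computed once per code object */
    int n_blocks;       /* blocks currently in use */
    TryBlock blocks[];  /* C99 flexible array member */
} SketchFrame;

/* Bytes needed for a frame whose code nests at most max_blocks deep. */
static size_t
sketch_frame_size(int max_blocks)
{
    assert(max_blocks >= 0);
    return sizeof(SketchFrame) + (size_t)max_blocks * sizeof(TryBlock);
}

/* Allocate exactly as much block storage as the code object needs. */
static SketchFrame *
sketch_frame_new(int max_blocks)
{
    SketchFrame *f = malloc(sketch_frame_size(max_blocks));
    if (f == NULL)
        return NULL;
    f->max_blocks = max_blocks;
    f->n_blocks = 0;
    return f;
}

/* Push a block; returns 0 on success, -1 if the declared depth is exceeded. */
static int
sketch_frame_push(SketchFrame *f, int type, int handler, int level)
{
    assert(f != NULL);
    if (f->n_blocks >= f->max_blocks)
        return -1;
    TryBlock *b = &f->blocks[f->n_blocks++];
    b->b_type = type;
    b->b_handler = handler;
    b->b_level = level;
    return 0;
}
```

Assuming 4-byte ints, a fixed array of 20 three-int blocks costs 240 bytes per frame, while a code object with a single loop would need only 12 bytes of block storage in this layout. The only limit left is the per-code-object depth, so no global cap is required.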

> My goal was to drop the frame size small enough for PyMalloc. I don't think I ever tried to change the allocs. But since I never got it faster, I dropped it.

Note that since frames participate in cyclic gc, each needs another 12 (Linux) or 16 (Windows) bytes for the gc header too. That's why I said "> 350" at the start when everyone else was quoting basicsize as if the latter had something to do with reality.


