[Python-Dev] PATCH: Fast globals/builtins lookups for 2.6

Neil Toronto ntoronto at cs.byu.edu
Thu Nov 29 11:26:37 CET 2007


I've posted a patch here:

http://bugs.python.org/issue1518

for 2.6a0 that speeds up LOAD_GLOBAL to where it's just barely slower than LOAD_FAST for both globals and builtins. It started life here, in Python-ideas:

http://mail.python.org/pipermail/python-ideas/2007-November/001212.html
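
For readers who have not looked at the two opcodes side by side, here is a minimal sketch using the standard `dis` module on a current CPython 3 (not the 2.6a0 tree the patch targets). The function names `read_global`, `read_local` and `opnames` are made up for illustration, and exact opcode names for local reads can differ between CPython releases.

```python
import dis


def read_global():
    # 'len' is not a local name, so the compiler emits LOAD_GLOBAL for it.
    return len


def read_local(value):
    # 'value' is a parameter, so it is read with a LOAD_FAST-family opcode.
    return value


def opnames(func):
    """Return the opcode names the compiler generated for func."""
    return [ins.opname for ins in dis.get_instructions(func)]


if __name__ == "__main__":
    print(opnames(read_global))
    print(opnames(read_local))
```

LOAD_FAST indexes directly into the frame's local slots, while an unpatched LOAD_GLOBAL performs up to two dictionary lookups by string key (globals, then builtins); that is the gap the patch tries to close.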

The idea is to cache pointers to dictionary entries rather than dictionary values, as is usually suggested for speeding up globals. In my original approach, every dictionary would keep a list of "observers" that it would notify when an entry pointer became invalid. However, the surgery on PyDictObject became too invasive, to the point where the new code was affecting the performance of unrelated code paths. (It was probably caching issues.) It also made some (very rare) operations on builtins and globals very, very slow.

The new approach tags each PyDictObject with a "version": a 64-bit value that is incremented for every operation that invalidates at least one entry pointer. (Those are inserting set, delete, pop, clear and resize. Non-inserting set and get are unaffected.) In this approach, changes to PyDictObject are noninvasive and do not affect performance in any measurable way (as far as I can tell).
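
The versioning rule can be modelled in pure Python. This is only a toy sketch, not the C code from the patch: `VersionedDict` and its `version` attribute are hypothetical names, Python integers stand in for the 64-bit counter, resizes cannot be observed from Python and so are not modelled, and `update()`/`setdefault()` are deliberately left out to keep it short.

```python
class VersionedDict(dict):
    """Toy dict that bumps a counter on operations that could move entries."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.version = 0  # stands in for the 64-bit version tag

    def __setitem__(self, key, value):
        if key not in self:
            self.version += 1  # inserting set: a new entry is created
        super().__setitem__(key, value)  # non-inserting set: no bump

    def __delitem__(self, key):
        super().__delitem__(key)  # raises KeyError before any bump
        self.version += 1

    def pop(self, key, *default):
        existed = key in self
        result = super().pop(key, *default)
        if existed:
            self.version += 1  # an entry was actually removed
        return result

    def clear(self):
        super().clear()
        self.version += 1


if __name__ == "__main__":
    d = VersionedDict()
    d["x"] = 1      # inserting set
    d["x"] = 2      # non-inserting set
    print(d.version)
```

A consumer that remembers the version it last saw only needs one integer comparison to know whether its cached entry pointers may be stale.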

Every function has a PyFastGlobalsObject, which keeps entry pointers for everything in co_names and tracks its dicts' versions so it can update the pointers when one might have become invalid. LOAD_GLOBAL uses the PyFastGlobalsObject to get globals and builtins by name index rather than by string key.
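
The lookup scheme described above can be sketched roughly as follows. This is a pure-Python model under stated assumptions, not the patch itself: `Entry`, `VersionedNamespace` and `FastGlobals` are hypothetical names, an `Entry` object stands in for a C pointer to a dict entry, and the `refreshes` counter exists only so the caching is visible.

```python
class Entry:
    """Stands in for a dict entry; its value can change in place."""
    __slots__ = ("key", "value")

    def __init__(self, key, value):
        self.key = key
        self.value = value


class VersionedNamespace:
    """Toy dict whose version changes only when entries appear or vanish."""

    def __init__(self, initial=None):
        self._entries = {}
        self.version = 0
        for key, value in (initial or {}).items():
            self.set(key, value)

    def set(self, key, value):
        entry = self._entries.get(key)
        if entry is not None:
            entry.value = value  # non-inserting set: entry stays valid
        else:
            self._entries[key] = Entry(key, value)
            self.version += 1    # inserting set

    def delete(self, key):
        del self._entries[key]
        self.version += 1

    def find_entry(self, key):
        return self._entries.get(key)


class FastGlobals:
    """Caches one entry per name and re-resolves when a version changes."""

    def __init__(self, names, globals_ns, builtins_ns):
        self.names = tuple(names)  # plays the role of co_names
        self.globals_ns = globals_ns
        self.builtins_ns = builtins_ns
        self._cached = [None] * len(self.names)
        self._seen = None          # versions observed at the last refresh
        self.refreshes = 0

    def _refresh(self):
        for index, name in enumerate(self.names):
            entry = self.globals_ns.find_entry(name)
            if entry is None:      # fall back to builtins
                entry = self.builtins_ns.find_entry(name)
            self._cached[index] = entry
        self._seen = (self.globals_ns.version, self.builtins_ns.version)
        self.refreshes += 1

    def load(self, index):
        """Model of LOAD_GLOBAL: look up by name index, not by string key."""
        if self._seen != (self.globals_ns.version, self.builtins_ns.version):
            self._refresh()
        entry = self._cached[index]
        if entry is None:
            raise NameError(self.names[index])
        return entry.value
```

In the common case `load` costs one version comparison and one indexed read; the string-keyed lookups happen only in `_refresh`, after a name has been inserted into or removed from either namespace.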

With the patch, Python behaves differently in these two ways (as far as I can tell):

  1. As before, a __builtins__ ({'None': None}) is created for frames whose globals do not have one. Unlike before, it's added to the globals dict rather than just kept internally. This made implementation easier - I don't think it's a big deal (but I might be wrong).

  2. A change of __builtins__ (its value, not its contents) always appears at the beginning of a stack frame. (Before, changing __builtins__ in a module would be undetectable if function calls were kept within that module.) This is likely of little importance.

I'd love to see it included. I have donned my asbestos underwear and my best chain mail shirt and I'm begging for comments.

Neil


