Here is the Nth patch for a globals/builtins cache. As with other caches of this kind, it shows little to no gain on non-micro benchmarks, which suggests that, contrary to popular belief, globals/builtins lookups are not a major roadblock in today's Python performance. However, this patch could be useful in combination with other optimizations such as . Indeed, using the globals/builtins version id, it is easy and very cheap to detect whether the function pointed to by a global name has changed. As for micro-benchmarks, they do show a good improvement on builtins lookups:

$ ./python -m timeit "x=len;x=len;x=len;x=len;x=len;x=len;x=len;x=len;x=len;x=len;"
-> without patch: 1000000 loops, best of 3: 0.282 usec per loop
-> with patch:    10000000 loops, best of 3: 0.183 usec per loop
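To make the version-id idea concrete, here is a rough pure-Python sketch of the caching scheme (not the C patch itself). The names VersionedDict, GlobalCacheEntry and load_global are invented for illustration; the assumption is simply that each namespace dict carries a counter that is bumped on every mutation, which the patch maintains at the C level:

    class VersionedDict(dict):
        """Dict that bumps a version counter on every mutation."""
        def __init__(self, *args, **kwargs):
            super().__init__(*args, **kwargs)
            self.version = 0

        def __setitem__(self, key, value):
            super().__setitem__(key, value)
            self.version += 1

        def __delitem__(self, key):
            super().__delitem__(key)
            self.version += 1


    class GlobalCacheEntry:
        """Per-name cache slot, conceptually attached to a LOAD_GLOBAL site."""
        __slots__ = ("globals_version", "builtins_version", "value")

        def __init__(self):
            self.globals_version = -1
            self.builtins_version = -1
            self.value = None


    def load_global(name, entry, globals_d, builtins_d):
        """Return globals_d[name] or builtins_d[name], using the cache entry.

        The cached value is reused as long as neither namespace has been
        mutated since the last lookup; otherwise fall back to the normal
        two-level lookup and refresh the entry.
        """
        if (entry.globals_version == globals_d.version
                and entry.builtins_version == builtins_d.version):
            return entry.value               # fast path: nothing changed
        try:
            value = globals_d[name]
        except KeyError:
            value = builtins_d[name]
        entry.globals_version = globals_d.version
        entry.builtins_version = builtins_d.version
        entry.value = value
        return value


    # Usage: the second lookup hits the fast path, the mutation invalidates it.
    g = VersionedDict(x=1)
    b = VersionedDict(len=len)
    slot = GlobalCacheEntry()
    assert load_global("len", slot, g, b) is len    # slow path fills the cache
    assert load_global("len", slot, g, b) is len    # fast path, versions match
    b["len"] = print                                # mutation bumps the version
    assert load_global("len", slot, g, b) is print  # cache correctly refreshed

The point is that validating the cache is just two integer comparisons, which is also why the same counter would make it cheap for a specializing optimizer to check that a global still points to the function it specialized for.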
There aren't many possible approaches. The more complex variants of globals caches also try to speed up writes, which is IMO a waste of time, since rebinding globals is not good coding practice, and especially not in the middle of time-critical loops. (By the way, the patch only addresses normal functions, but generators would easily benefit from a similar treatment.) And Skip is right that this would be most useful when paired with a JIT, which would allow aggressive specialization and possibly inlining.