[Python-Dev] PEP 550 leak-in vs leak-out, why not just a ChainMap

Jim J. Jewett jimjjewett at gmail.com
Thu Aug 24 10:05:05 EDT 2017


On Thu, Aug 24, 2017 at 1:12 AM, Yury Selivanov wrote:
> On Thu, Aug 24, 2017 at 12:32 AM, Jim J. Jewett <jimjjewett at gmail.com> wrote:

> The key requirement for using immutable data structures is to make the "get_execution_context" operation fast.

Do you really need the whole execution context, or do you just need the current value of a specific key? (Or, sometimes, the writable portion of the context.)

> Currently, the PEP doesn't do a good job of explaining why we need that operation and why it will be used by asyncio.Task and call_soon, so I understand the confusion.

OK, the schedulers need the whole context, but if implemented as a ChainMap (instead of per-key), isn't that just a single constant? As in, don't they always schedule to the same thread? And when they need another map, isn't that because the required context is already available from whichever code requested the scheduling?
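A minimal sketch of the ChainMap idea, using only the standard library's collections.ChainMap. The names base_context, task_context and captured, and the key string, are illustrative assumptions and not anything defined by PEP 550:

```python
from collections import ChainMap

# Hypothetical thread-level context; the key name is made up for illustration.
base_context = ChainMap({"decimal.precision": 28})

# Entering a generator or task pushes a fresh writable map onto the front.
task_context = base_context.new_child()
task_context["decimal.precision"] = 10   # the write lands in the front map only

# A scheduler that needs "the whole context" only has to keep one reference.
captured = task_context

outer_value = base_context["decimal.precision"]   # outer scope is unaffected
inner_value = captured["decimal.precision"]       # lookup walks the chain
front_map_only = dict(task_context.maps[0])       # the writable portion
```

Under this assumption, capturing the context is a single reference copy, and only the front map is ever written to.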

>> (A) How many values do you expect a typical generator to use? The Django survey suggested mostly 0, sometimes 1, occasionally 2. So caching the values of all possible keys probably won't pay off.

> Not many, but caching is still just as important, because some API users want the "get()" operation to be as fast as possible under all conditions.

Sure, but only because they view it as a hot path; if the cost of that speedup is slowing down another hot path, like scheduling the generator in the first place, it may not be worth it.

According to the PEP timings, HAMT doesn't beat a copy-on-write dict until over 100 items, and never beats a regular dict. That suggests to me that it won't actually help the overall speed for a typical (as opposed to worst-case) process.
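For readers unfamiliar with the term, here is a rough sketch of what a copy-on-write dict means in this comparison. CowContext is a hypothetical class written for illustration; it is not the implementation that the PEP benchmarked:

```python
class CowContext:
    """Hypothetical immutable context: every set() copies the whole dict."""

    def __init__(self, data=None):
        self._data = dict(data or {})

    def get(self, key, default=None):
        # Reads are a plain dict lookup.
        return self._data.get(key, default)

    def set(self, key, value):
        # Writes copy all existing items, so the cost grows with the size.
        new_data = dict(self._data)
        new_data[key] = value
        return CowContext(new_data)


ctx1 = CowContext()
ctx2 = ctx1.set("a", 1)
ctx3 = ctx2.set("a", 2)   # earlier snapshots stay unchanged
```

The copy in set() is linear in the number of items, which is why the size of a typical context matters for the comparison above.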

>> And, of course, using a ChainMap means that the keys do NOT have to be predefined ... so the Key class really can be skipped.

> The first version of the PEP had no ContextKey object, and the most popular complaint about it was that the key names would clash.

That is true of any global registry. Thus the use of keys with prefixes like com.sun.
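As a small illustration of the prefix convention, assuming plain string keys in a ChainMap (the framework names and key strings below are invented):

```python
from collections import ChainMap

context = ChainMap({})

# Bare names: the second framework silently overwrites the first.
context["timeout"] = 30          # hypothetical web framework
context["timeout"] = 5           # hypothetical database library
clashed = context["timeout"]     # the web framework's value is gone

# Reverse-domain style prefixes keep the two values apart.
context["org.example.web.timeout"] = 30
context["org.example.db.timeout"] = 5
```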

The only thing pre-declaring a ContextKey buys in terms of clashes is that a sophisticated scheduler would have less difficulty figuring out which clashes will cause thrashing in the cache.

Or are you suggesting that the key can only be declared once (as opposed to once per piece of code), so that the second framework to use the same name will see a RuntimeError?

-jJ


