[Python-Dev] PEP 550 leak-in vs leak-out, why not just a ChainMap

Nathaniel Smith njs at pobox.com
Thu Aug 24 04:22:52 EDT 2017

Previous message (by thread): [Python-Dev] PEP 550 leak-in vs leak-out, why not just a ChainMap
Next message (by thread): [Python-Dev] Scope, not context? (was Re: PEP 550 v3 naming)

On Wed, Aug 23, 2017 at 9:32 PM, Jim J. Jewett <jimjjewett at gmail.com> wrote:

> While the context is defined conceptually as a nested chain of key:value mappings, we avoid using the mapping syntax because of the way the values can shift dynamically out from under you based on who called you ... instead of having the problem of changes inside the generator leaking out, we instead had the problem of changes outside the generator not making their way in
>
> I still don't see how this is different from a ChainMap.
>
> If you are using a stack(chain) of [dglobal, dthread, dA, dB, dC, dmine] maps as your implicit context, then a change to dthread map (that some other code could make) will be visible unless it is masked.
>
> Similarly, if the fallback for dC changes from dB to dB1 (which points directly to dthread), that will be visible for any keys that were previously resolved in dA or dB, or are now resolved in dB1.
>
> Those seem like exactly the cases that would (and should) cause "shifting values".
>
> This does mean that you can't cache everything in the localmost map, but that is a problem with the optimization regardless of how the implementation is done.
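The visibility behaviour described in the quoted text can be sketched with the standard library's `collections.ChainMap`. The layer names and the `"precision"` key below are illustrative assumptions, not anything defined by the PEP:

```python
from collections import ChainMap

# Hypothetical layers, loosely named after the maps in the quoted message.
d_global = {"precision": 28}
d_thread = {}
d_mine = {}

# Lookups search d_mine first, then d_thread, then d_global.
context = ChainMap(d_mine, d_thread, d_global)

before = context["precision"]              # resolved in d_global

# Some other code changes a lower map; the change shows through the chain.
d_thread["precision"] = 10
after_lower_change = context["precision"]  # now resolved in d_thread

# Once the localmost map sets the key, changes further down are masked.
d_mine["precision"] = 5
d_thread["precision"] = 99
after_mask = context["precision"]          # still resolved in d_mine
```

Here `before`, `after_lower_change`, and `after_mask` come out as 28, 10, and 5 respectively, which is the "shifting values" effect under discussion.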

It's crucial that the only thing that can affect the result of calling ContextKey.get() is other method calls on that same ContextKey within the same thread. That's what enables ContextKey to efficiently cache repeated lookups, which is an important optimization for code like decimal or numpy that needs to access their local context extremely quickly (like, a dict lookup is too slow). In fact this caching trick is just preserving what decimal does right now in their thread-local handling (because accessing the threadstate dict is too slow for them).

So we can't expose any kind of mutable API to individual maps, because then someone might call __setitem__ on some map that's lower down in the stack, and break caching.
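A rough sketch of how that breakage could look, assuming a deliberately naive cache. `CachedKey` is a hypothetical stand-in and is not how PEP 550 implements ContextKey; it only illustrates why a write that bypasses the key invalidates nothing:

```python
from collections import ChainMap


class CachedKey:
    """Hypothetical key that remembers the result of its last lookup."""

    _MISSING = object()

    def __init__(self, name, chain):
        self.name = name
        self.chain = chain
        self._cached = self._MISSING

    def set(self, value):
        # Writes that go through the key can keep the cache in step.
        self.chain[self.name] = value  # ChainMap writes to its first map
        self._cached = value

    def get(self):
        # Fast path: reuse the cached value instead of walking the chain.
        if self._cached is self._MISSING:
            self._cached = self.chain[self.name]
        return self._cached


lower = {"rounding": "ROUND_HALF_EVEN"}
chain = ChainMap({}, lower)
key = CachedKey("rounding", chain)

first = key.get()                  # fills the cache from `lower`
lower["rounding"] = "ROUND_DOWN"   # direct __setitem__ on a lower map
stale = key.get()                  # the cache was never told about the write
fresh = chain["rounding"]          # an uncached lookup sees the new value
```

After the direct write, `stale` and `fresh` disagree, which is exactly the situation a mutable per-map API would make possible.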

> And, of course, using a ChainMap means that the keys do NOT have to be predefined ... so the Key class really can be skipped.

The motivations for the Key class are to eliminate the need to worry about accidental key collisions between unrelated libraries, to provide some important optimizations (as per above), and to make it easy and natural to provide convenient APIs, such as saving and restoring the state of a value inside a context manager. Those are all orthogonal to whether the underlying structure is implemented as a ChainMap or as something more specialized.
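To make two of those motivations concrete, here is a minimal sketch assuming a plain dict as storage. The `Key` class and its `temporarily` method are hypothetical names for illustration and do not reflect the PEP's actual API:

```python
from contextlib import contextmanager


class Key:
    """Hypothetical key object; identity, not the name, distinguishes keys."""

    _MISSING = object()

    def __init__(self, name, default=None):
        self.name = name          # used for debugging only
        self.default = default

    def get(self, storage):
        return storage.get(self, self.default)

    def set(self, storage, value):
        storage[self] = value

    @contextmanager
    def temporarily(self, storage, value):
        # Save the old state, install the new value, restore on exit.
        old = storage.get(self, Key._MISSING)
        storage[self] = value
        try:
            yield
        finally:
            if old is Key._MISSING:
                del storage[self]
            else:
                storage[self] = old


storage = {}
lib_a_key = Key("context")
lib_b_key = Key("context")   # same name, chosen by an unrelated library

lib_a_key.set(storage, "A")
lib_b_key.set(storage, "B")
values = (lib_a_key.get(storage), lib_b_key.get(storage))

with lib_a_key.temporarily(storage, "temp"):
    inside = lib_a_key.get(storage)
restored = lib_a_key.get(storage)
```

Because objects hash by identity by default, the two keys named "context" never collide, and the same code would work whatever mapping type sits underneath.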

But I tend to agree with your general argument that the current PEP is trying a bit too hard to hide away all this structure where no-one can see it. The above constraints mean that simply exposing a ChainMap as the public API is probably a bad idea. Even if there are compelling performance advantages to a fancy immutable-HAMT implementation (I'm in wait-and-see mode on this myself), there are still a lot more introspection-y operations that could be provided, like:

make a LocalContext out of a {ContextKey: value} dict
get a {ContextKey: value} dict out of a LocalContext
get the underlying list of LocalContexts out of an ExecutionContext
create an ExecutionContext out of a list of LocalContexts
given a ContextKey and a LocalContext, get the current value of the key in the context
given a ContextKey and an ExecutionContext, get out a list of values at each level
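As a rough sketch of what such read-only operations might look like: the class names below come from the list above, but every method name and signature is an assumption made for illustration, not the PEP's API.

```python
class ContextKey:
    """Hypothetical key; compared by identity."""

    def __init__(self, name):
        self.name = name


class LocalContext:
    """Hypothetical read-only snapshot of {ContextKey: value} pairs."""

    def __init__(self, mapping=None):
        self._data = dict(mapping or {})   # copy, so callers cannot mutate it

    def as_dict(self):
        return dict(self._data)            # hand out a copy, not the original

    def lookup(self, key, default=None):
        return self._data.get(key, default)


class ExecutionContext:
    """Hypothetical read-only stack of LocalContexts, outermost first."""

    def __init__(self, local_contexts):
        self._stack = tuple(local_contexts)

    def local_contexts(self):
        return list(self._stack)

    def values_at_each_level(self, key, default=None):
        return [lc.lookup(key, default) for lc in self._stack]


precision = ContextKey("precision")
outer = LocalContext({precision: 28})
inner = LocalContext({precision: 10})
empty = LocalContext()

ec = ExecutionContext([outer, inner, empty])
levels = ec.values_at_each_level(precision)
```

Every accessor here returns a copy or a plain value, so none of it lets a caller write to a map lower in the stack, which is the constraint that rules out exposing a mutable ChainMap.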

-n

-- Nathaniel J. Smith -- https://vorpus.org
