[Python-Dev] PEP 567 v2
Chris Jerdonek chris.jerdonek at gmail.com
Thu Dec 28 05:28:28 EST 2017
I have a couple basic questions around how this API could be used in practice. Both of my questions are for the Python API as applied to Tasks in asyncio.
Would this API support looking up the value of a context variable for another Task? For example, if you're managing multiple tasks using asyncio.wait() and there is an exception in some task, you might want to examine and report the value of a context variable for that task.
Would an appropriate use of this API be to assign a unique task id to each task? Or can that be handled more simply? I'm wondering because I recently thought this would be useful, and it doesn't seem like asyncio means for one to subclass Task (though I could be wrong).
Thanks, --Chris
On Wed, Dec 27, 2017 at 10:08 PM, Yury Selivanov <yselivanov.ml at gmail.com> wrote:
This is a second version of PEP 567.
A few things have changed:

1. I now have a reference implementation:
   https://github.com/python/cpython/pull/5027

2. The C API was updated to match the implementation.

3. The get_context() function was renamed to copy_context() to better
   reflect what it is really doing.

4. A few clarifications/edits here and there to address earlier feedback.
Yury

PEP: 567
Title: Context Variables
Version: $Revision$
Last-Modified: $Date$
Author: Yury Selivanov <yury at magic.io>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 12-Dec-2017
Python-Version: 3.7
Post-History: 12-Dec-2017, 28-Dec-2017


Abstract
========

This PEP proposes a new ``contextvars`` module and a set of new
CPython C APIs to support context variables.  This concept is
similar to thread-local storage (TLS), but, unlike TLS, it also allows
correctly keeping track of values per asynchronous task, e.g.
``asyncio.Task``.

This proposal is a simplified version of :pep:`550`.  The key
difference is that this PEP is concerned only with solving the case
for asynchronous tasks, not for generators.  There are no proposed
modifications to any built-in types or to the interpreter.

This proposal is not strictly related to Python Context Managers.
Although it does provide a mechanism that can be used by Context
Managers to store their state.


Rationale
=========

Thread-local variables are insufficient for asynchronous tasks that
execute concurrently in the same OS thread.  Any context manager that
saves and restores a context value using ``threading.local()`` will
have its context values bleed to other code unexpectedly when used
in async/await code.

A few examples where having a working context local storage for
asynchronous code is desirable:

* Context managers like ``decimal`` contexts and ``numpy.errstate``.

* Request-related data, such as security tokens and request
  data in web applications, language context for ``gettext``, etc.

* Profiling, tracing, and logging in large code bases.


Introduction
============

The PEP proposes a new mechanism for managing context variables.
The key classes involved in this mechanism are ``contextvars.Context``
and ``contextvars.ContextVar``.  The PEP also proposes some policies
for using the mechanism around asynchronous tasks.

The proposed mechanism for accessing context variables uses the
``ContextVar`` class.  A module (such as ``decimal``) that wishes to
store a context variable should:

* declare a module-global variable holding a ``ContextVar`` to
  serve as a key;

* access the current value via the ``get()`` method on the
  key variable;

* modify the current value via the ``set()`` method on the
  key variable.
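The three steps above can be sketched as follows.  This is only an
illustration, assuming a Python in which the proposed ``contextvars``
module is available; the names ``precision``, ``get_precision`` and
``set_precision`` are hypothetical and are not taken from ``decimal``
or any other library.

```python
import contextvars

# Step 1: a module-global ContextVar serves as the key.
# (The name 'precision' and the default 28 are illustrative only.)
precision = contextvars.ContextVar('precision', default=28)


def get_precision():
    # Step 2: read the current value through the key variable.
    return precision.get()


def set_precision(value):
    # Step 3: change the current value through the key variable.
    precision.set(value)


before = get_precision()   # falls back to the declared default
set_precision(10)
after = get_precision()    # the value set in the current context
```

Callers never touch a ``Context`` object directly here; the key
variable is the only handle the module needs.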
The notion of "current value" deserves special consideration:
different asynchronous tasks that exist and execute concurrently
may have different values for the same key.  This idea is well-known
from thread-local storage but in this case the locality of the value is
not necessarily bound to a thread.  Instead, there is the notion of the
"current ``Context``" which is stored in thread-local storage, and
is accessed via ``contextvars.copy_context()`` function.
Manipulation of the current ``Context`` is the responsibility of the
task framework, e.g. asyncio.

A ``Context`` is conceptually a read-only mapping, implemented using
an immutable dictionary.  The ``ContextVar.get()`` method does a
lookup in the current ``Context`` with ``self`` as a key, raising a
``LookupError`` or returning a default value specified in
the constructor.

The ``ContextVar.set(value)`` method clones the current ``Context``,
assigns the ``value`` to it with ``self`` as a key, and sets the
new ``Context`` as the new current ``Context``.


Specification
=============

A new standard library module ``contextvars`` is added with the
following APIs:

1. ``copy_context() -> Context`` function is used to get a copy of
   the current ``Context`` object for the current OS thread.

2. ``ContextVar`` class to declare and access context variables.

3. ``Context`` class encapsulates context state.  Every OS thread
   stores a reference to its current ``Context`` instance.
   It is not possible to control that reference manually.
   Instead, the ``Context.run(callable, *args, **kwargs)`` method is
   used to run Python code in another context.


contextvars.ContextVar
----------------------

The ``ContextVar`` class has the following constructor signature:
``ContextVar(name, *, default=_NO_DEFAULT)``.  The ``name`` parameter
is used only for introspection and debug purposes, and is exposed
as a read-only ``ContextVar.name`` attribute.  The ``default``
parameter is optional.  Example::

    # Declare a context variable 'var' with the default value 42.
    var = ContextVar('var', default=42)

(The ``_NO_DEFAULT`` is an internal sentinel object used to
detect if the default value was provided.)

``ContextVar.get()`` returns a value for context variable from the
current ``Context``::

    # Get the value of `var`.
    var.get()

``ContextVar.set(value) -> Token`` is used to set a new value for
the context variable in the current ``Context``::

    # Set the variable 'var' to 1 in the current context.
    var.set(1)

``ContextVar.reset(token)`` is used to reset the variable in the
current context to the value it had before the ``set()`` operation
that created the ``token``::

    assert var.get(None) is None

    token = var.set(1)
    try:
        ...
    finally:
        var.reset(token)

    assert var.get(None) is None

``ContextVar.reset()`` method is idempotent and can be called
multiple times on the same Token object: second and later calls
will be no-ops.


contextvars.Token
-----------------

``contextvars.Token`` is an opaque object that should be used to
restore the ``ContextVar`` to its previous value, or remove it from
the context if the variable was not set before.  It can be created
only by calling ``ContextVar.set()``.

For debug and introspection purposes it has:

* a read-only attribute ``Token.var`` pointing to the variable
  that created the token;

* a read-only attribute ``Token.old_value`` set to the value the
  variable had before the ``set()`` call, or to ``Token.MISSING``
  if the variable wasn't set before.

Having the ``ContextVar.set()`` method returning a ``Token`` object
and the ``ContextVar.reset(token)`` method, allows context variables
to be removed from the context if they were not in it before the
``set()`` call.


contextvars.Context
-------------------

``Context`` object is a mapping of context variables to values.

``Context()`` creates an empty context.
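As a small illustration of the last point, the sketch below contrasts
an empty ``Context()`` with a copy of the current one.  It assumes the
proposed module is available as ``contextvars`` and that
``Context.run()`` returns the result of the callable, as in the
pseudo-code given later in this PEP; ``var`` and ``read_var`` are
illustrative names.

```python
import contextvars

var = contextvars.ContextVar('var', default='unset')


def read_var():
    # Looks 'var' up in whichever context this function is run in.
    return var.get()


# Set a value in the current context of this thread.
var.set('spam')

# A freshly created Context holds no variables, so the lookup
# falls back to the default declared on the ContextVar.
empty = contextvars.Context()
seen_in_empty = empty.run(read_var)

# A copy of the current context does carry the value set above.
seen_in_copy = contextvars.copy_context().run(read_var)
```

Only reading happened inside ``empty``, so it still contains no
variables after ``run()`` returns.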
To get a copy of the current ``Context`` for the current OS thread,
use the ``contextvars.copy_context()`` method::

    ctx = contextvars.copy_context()

To run Python code in some ``Context``, use ``Context.run()``
method::

    ctx.run(function)

Any changes to any context variables that ``function`` causes will
be contained in the ``ctx`` context::

    var = ContextVar('var')
    var.set('spam')

    def function():
        assert var.get() == 'spam'

        var.set('ham')
        assert var.get() == 'ham'

    ctx = copy_context()

    # Any changes that 'function' makes to 'var' will stay
    # isolated in the 'ctx'.
    ctx.run(function)

    assert var.get() == 'spam'

Any changes to the context will be contained in the ``Context``
object on which ``run()`` is called.

``Context.run()`` is used to control in which context asyncio
callbacks and Tasks are executed.  It can also be used to run some
code in a different thread in the context of the current thread::

    executor = ThreadPoolExecutor()
    current_context = contextvars.copy_context()

    executor.submit(
        lambda: current_context.run(some_function))

``Context`` objects implement the ``collections.abc.Mapping`` ABC.
This can be used to introspect context objects::

    ctx = contextvars.copy_context()

    # Print all context variables and their values in 'ctx':
    print(ctx.items())

    # Print the value of 'some_variable' in context 'ctx':
    print(ctx[some_variable])


asyncio
-------

``asyncio`` uses ``Loop.call_soon()``, ``Loop.call_later()``,
and ``Loop.call_at()`` to schedule the asynchronous execution of a
function.  ``asyncio.Task`` uses ``call_soon()`` to run the
wrapped coroutine.

We modify ``Loop.call_{at,later,soon}`` and
``Future.add_done_callback()`` to accept the new optional *context*
keyword-only argument, which defaults to the current context::

    def call_soon(self, callback, *args, context=None):
        if context is None:
            context = contextvars.copy_context()

        # ... some time later
        context.run(callback, *args)

Tasks in asyncio need to maintain their own context that they inherit
from the point they were created at.  ``asyncio.Task`` is modified
as follows::

    class Task:
        def __init__(self, coro):
            ...
            # Get the current context snapshot.
            self._context = contextvars.copy_context()
            self._loop.call_soon(self._step, context=self._context)

        def _step(self, exc=None):
            ...
            # Every advance of the wrapped coroutine is done in
            # the task's context.
            self._loop.call_soon(self._step, context=self._context)
            ...


C API
-----

1. ``PyContextVar * PyContextVar_New(char *name, PyObject *default)``:
   create a ``ContextVar`` object.

2. ``int PyContextVar_Get(PyContextVar *, PyObject *default_value, PyObject **value)``:
   return ``-1`` if an error occurs during the lookup, ``0`` otherwise.
   If a value for the context variable is found, it will be set to the
   ``value`` pointer.  Otherwise, ``value`` will be set to
   ``default_value`` when it is not ``NULL``.  If ``default_value`` is
   ``NULL``, ``value`` will be set to the default value of the
   variable, which can be ``NULL`` too.  ``value`` is always a borrowed
   reference.

3. ``PyContextToken * PyContextVar_Set(PyContextVar *, PyObject *)``:
   set the value of the variable in the current context.

4. ``PyContextVar_Reset(PyContextVar *, PyContextToken *)``:
   reset the value of the context variable.

5. ``PyContext * PyContext_New()``: create a new empty context.

6. ``PyContext * PyContext_Copy()``: get a copy of the current context.

7. ``int PyContext_Enter(PyContext *)`` and
   ``int PyContext_Exit(PyContext *)`` allow setting and restoring
   the context for the current OS thread.  It is required to always
   restore the previous context::

      PyContext *old_ctx = PyContext_Copy();
      if (old_ctx == NULL) goto error;

      if (PyContext_Enter(new_ctx)) goto error;

      // run some code

      if (PyContext_Exit(old_ctx)) goto error;


Implementation
==============

This section explains high-level implementation details in
pseudo-code.  Some optimizations are omitted to keep this section
short and clear.
For the purposes of this section, we implement an immutable dictionary
using ``dict.copy()``::

    class _ContextData:

        def __init__(self):
            self._mapping = dict()

        def get(self, key):
            return self._mapping[key]

        def set(self, key, value):
            copy = _ContextData()
            copy._mapping = self._mapping.copy()
            copy._mapping[key] = value
            return copy

        def delete(self, key):
            copy = _ContextData()
            copy._mapping = self._mapping.copy()
            del copy._mapping[key]
            return copy

Every OS thread has a reference to the current ``_ContextData``.
``PyThreadState`` is updated with a new ``context_data`` field that
points to a ``_ContextData`` object::

    class PyThreadState:
        context_data: _ContextData

``contextvars.copy_context()`` is implemented as follows::

    def copy_context():
        ts : PyThreadState = PyThreadState_Get()

        if ts.context_data is None:
            ts.context_data = _ContextData()

        ctx = Context()
        ctx._data = ts.context_data
        return ctx

``contextvars.Context`` is a wrapper around ``_ContextData``::

    class Context(collections.abc.Mapping):

        def __init__(self):
            self._data = _ContextData()

        def run(self, callable, *args, **kwargs):
            ts : PyThreadState = PyThreadState_Get()
            saved_data : _ContextData = ts.context_data

            try:
                ts.context_data = self._data
                return callable(*args, **kwargs)
            finally:
                self._data = ts.context_data
                ts.context_data = saved_data

        # Mapping API methods are implemented by delegating
        # `get()` and other Mapping calls to `self._data`.

``contextvars.ContextVar`` interacts with
``PyThreadState.context_data`` directly::

    class ContextVar:

        def __init__(self, name, *, default=_NO_DEFAULT):
            self._name = name
            self._default = default

        @property
        def name(self):
            return self._name

        def get(self, default=_NO_DEFAULT):
            ts : PyThreadState = PyThreadState_Get()
            data : _ContextData = ts.context_data

            try:
                return data.get(self)
            except KeyError:
                pass

            if default is not _NO_DEFAULT:
                return default

            if self._default is not _NO_DEFAULT:
                return self._default

            raise LookupError

        def set(self, value):
            ts : PyThreadState = PyThreadState_Get()
            data : _ContextData = ts.context_data

            try:
                old_value = data.get(self)
            except KeyError:
                old_value = Token.MISSING

            ts.context_data = data.set(self, value)
            return Token(self, old_value)

        def reset(self, token):
            if token._used:
                return

            ts : PyThreadState = PyThreadState_Get()
            data : _ContextData = ts.context_data

            if token._old_value is Token.MISSING:
                ts.context_data = data.delete(token._var)
            else:
                ts.context_data = data.set(token._var,
                                           token._old_value)

            token._used = True


    class Token:

        MISSING = object()

        def __init__(self, var, old_value):
            self._var = var
            self._old_value = old_value
            self._used = False

        @property
        def var(self):
            return self._var

        @property
        def old_value(self):
            return self._old_value


Implementation Notes
====================

* The internal immutable dictionary for ``Context`` is implemented
  using Hash Array Mapped Tries (HAMT).  They allow for O(log N)
  ``set`` operation, and for O(1) ``copy_context()`` function, where
  *N* is the number of items in the dictionary.  For a detailed
  analysis of HAMT performance please refer to :pep:`550` [1]_.

* ``ContextVar.get()`` has an internal cache for the most recent
  value, which allows bypassing a hash lookup.  This is similar
  to the optimization the ``decimal`` module implements to
  retrieve its context from ``PyThreadState_GetDict()``.
  See :pep:`550` which explains the implementation of the cache
  in great detail.


Summary of the New APIs
=======================

* A new ``contextvars`` module with ``ContextVar``, ``Context``,
  and ``Token`` classes, and a ``copy_context()`` function.

* ``asyncio.Loop.call_at()``, ``asyncio.Loop.call_later()``,
  ``asyncio.Loop.call_soon()``, and
  ``asyncio.Future.add_done_callback()`` run callback functions in
  the context they were called in.  A new *context* keyword-only
  parameter can be used to specify a custom context.

* ``asyncio.Task`` is modified internally to maintain its own
  context.


Design Considerations
=====================

Why contextvars.Token and not ContextVar.unset()?
-------------------------------------------------

The Token API makes it possible to avoid having a
``ContextVar.unset()`` method, which is incompatible with the chained
contexts design of :pep:`550`.  Future compatibility with :pep:`550`
is desired (at least for Python 3.7) in case there is demand to
support context variables in generators and asynchronous generators.
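As a runnable illustration of the Token API discussed here, the sketch
below wraps ``set()`` and ``reset(token)`` in a context manager.  It
assumes the proposed module is available as ``contextvars``; the helper
``override`` and the variable ``cv`` are hypothetical names, not part
of the proposal.

```python
import contextlib
import contextvars

cv = contextvars.ContextVar('cv')  # no default: initially unset


@contextlib.contextmanager
def override(var, value):
    # Hypothetical helper: set 'var' for the duration of a block and
    # restore whatever state it had before, including "not set".
    token = var.set(value)
    try:
        yield
    finally:
        var.reset(token)


with override(cv, 'blah'):
    inside = cv.get()

# 'cv' was not set before the block, so reset() removed it again and
# the explicit fallback passed to get() is returned.
after = cv.get('missing')
```

The helper never needs to know whether the variable had a value
beforehand; the token carries that information.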
The Token API also offers better usability: the user does not have
to special-case absence of a value.  Compare::

    token = cv.set(blah)
    try:
        # code
    finally:
        cv.reset(token)

with::

    _deleted = object()
    old = cv.get(default=_deleted)
    try:
        cv.set(blah)
        # code
    finally:
        if old is _deleted:
            cv.unset()
        else:
            cv.set(old)


Rejected Ideas
==============

Replication of threading.local() interface
------------------------------------------

Please refer to :pep:`550` where this topic is covered in detail: [2]_.


Backwards Compatibility
=======================

This proposal preserves 100% backwards compatibility.

Libraries that use ``threading.local()`` to store context-related
values currently work correctly only for synchronous code.  Switching
them to use the proposed API will keep their behavior for synchronous
code unmodified, but will automatically enable support for
asynchronous code.


Reference Implementation
========================

The reference implementation can be found here: [3]_.


References
==========

.. [1] https://www.python.org/dev/peps/pep-0550/#appendix-hamt-performance-analysis

.. [2] https://www.python.org/dev/peps/pep-0550/#replication-of-threading-local-interface

.. [3] https://github.com/python/cpython/pull/5027


Copyright
=========

This document has been placed in the public domain.


..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End: