[Python-Dev] PEP 567 v2
Yury Selivanov yselivanov.ml at gmail.com
Thu Dec 28 01:08:13 EST 2017
This is a second version of PEP 567.
A few things have changed:
1. I now have a reference implementation: https://github.com/python/cpython/pull/5027
2. The C API was updated to match the implementation.
3. The get_context() function was renamed to copy_context() to better reflect what it is really doing.
4. A few clarifications/edits here and there to address earlier feedback.
Yury
PEP: 567
Title: Context Variables
Version: $Revision$
Last-Modified: $Date$
Author: Yury Selivanov <yury at magic.io>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 12-Dec-2017
Python-Version: 3.7
Post-History: 12-Dec-2017, 28-Dec-2017
Abstract
This PEP proposes a new contextvars module and a set of new
CPython C APIs to support context variables. This concept is
similar to thread-local storage (TLS), but, unlike TLS, it also allows
correctly keeping track of values per asynchronous task, e.g.
asyncio.Task.
This proposal is a simplified version of :pep:`550`. The key
difference is that this PEP is concerned only with solving the case
for asynchronous tasks, not for generators. There are no proposed
modifications to any built-in types or to the interpreter.
This proposal is not strictly related to Python Context Managers, although it does provide a mechanism that can be used by Context Managers to store their state.
Rationale
Thread-local variables are insufficient for asynchronous tasks that
execute concurrently in the same OS thread. Any context manager that
saves and restores a context value using threading.local() will
have its context values bleed to other code unexpectedly when used
in async/await code.
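The bleed described above can be reproduced with a small sketch. The names _local, worker, and main are illustrative only and are not part of the proposal; the example assumes Python 3.7+ for asyncio.run().

```python
import asyncio
import threading

# Hypothetical module-level state stored the thread-local way.
_local = threading.local()

async def worker(name):
    _local.value = name        # "save" a per-task value
    await asyncio.sleep(0)     # yield to the event loop; other tasks run here
    return _local.value        # may now hold another task's value

async def main():
    # Both workers run as tasks in the same OS thread.
    return await asyncio.gather(worker("a"), worker("b"))

results = asyncio.run(main())
print(results)
```

Because both tasks share one OS thread, the first worker reads back the value written by the second one after it resumes.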
A few examples where having a working context local storage for asynchronous code is desirable:
* Context managers like decimal contexts and numpy.errstate.
* Request-related data, such as security tokens and request data in web applications, language context for gettext, etc.
* Profiling, tracing, and logging in large code bases.
Introduction
The PEP proposes a new mechanism for managing context variables.
The key classes involved in this mechanism are contextvars.Context
and contextvars.ContextVar. The PEP also proposes some policies
for using the mechanism around asynchronous tasks.
The proposed mechanism for accessing context variables uses the
ContextVar class. A module (such as decimal) that wishes to
store a context variable should:
* declare a module-global variable holding a ContextVar to serve as a key;
* access the current value via the get() method on the key variable;
* modify the current value via the set() method on the key variable.
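As a minimal sketch of the three steps above, assuming a hypothetical module that keeps a "precision" setting (the names _precision, get_precision, and set_precision are made up for illustration):

```python
from contextvars import ContextVar

# Step 1: a module-global ContextVar serves as the key.
# The name string is used only for introspection.
_precision = ContextVar("precision", default=28)

def get_precision():
    # Step 2: read the current value through the key.
    return _precision.get()

def set_precision(value):
    # Step 3: modify the current value through the key.
    _precision.set(value)

before = get_precision()
set_precision(10)
after = get_precision()
```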
The notion of "current value" deserves special consideration:
different asynchronous tasks that exist and execute concurrently
may have different values for the same key. This idea is well-known
from thread-local storage but in this case the locality of the value is
not necessarily bound to a thread. Instead, there is the notion of the
"current Context", which is stored in thread-local storage and
is accessed via the contextvars.copy_context() function.
Manipulation of the current Context is the responsibility of the
task framework, e.g. asyncio.
A Context is conceptually a read-only mapping, implemented using
an immutable dictionary. The ContextVar.get() method does a
lookup in the current Context with self as a key, raising a
LookupError or returning a default value specified in
the constructor.
The ContextVar.set(value) method clones the current Context,
assigns the value to it with self as a key, and sets the
new Context as the new current Context.
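One observable consequence of this copy-on-write behaviour is that a snapshot taken before set() is not affected by it. A short sketch, where var and demo are illustrative names:

```python
import contextvars

var = contextvars.ContextVar("var")

def demo():
    before = contextvars.copy_context()   # snapshot taken before set()
    var.set(1)
    after = contextvars.copy_context()    # snapshot taken after set()
    return before, after

# Run inside a copy so the change does not touch the caller's context.
before, after = contextvars.copy_context().run(demo)

print(var in before)   # the earlier snapshot never sees the new value
print(after[var])
```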
Specification
A new standard library module contextvars is added with the
following APIs:
1. copy_context() -> Context function is used to get a copy of the current Context object for the current OS thread.
2. ContextVar class to declare and access context variables.
3. Context class encapsulates context state. Every OS thread stores a reference to its current Context instance. It is not possible to control that reference manually. Instead, the Context.run(callable, *args, **kwargs) method is used to run Python code in another context.
contextvars.ContextVar
The ContextVar class has the following constructor signature:
ContextVar(name, *, default=_NO_DEFAULT). The name parameter
is used only for introspection and debug purposes, and is exposed
as a read-only ContextVar.name attribute. The default
parameter is optional. Example::

    # Declare a context variable 'var' with the default value 42.
    var = ContextVar('var', default=42)

(The _NO_DEFAULT is an internal sentinel object used to
detect if the default value was provided.)
ContextVar.get() returns the value of the context variable from the
current Context::

    # Get the value of `var`.
    var.get()

ContextVar.set(value) -> Token is used to set a new value for
the context variable in the current Context::

    # Set the variable 'var' to 1 in the current context.
    var.set(1)
ContextVar.reset(token) is used to reset the variable in the
current context to the value it had before the set() operation
that created the token::

    assert var.get(None) is None

    token = var.set(1)
    try:
        ...
    finally:
        var.reset(token)

    assert var.get(None) is None

The ContextVar.reset() method is idempotent and can be called
multiple times on the same Token object: the second and later calls
are no-ops.
contextvars.Token
contextvars.Token is an opaque object that should be used to
restore the ContextVar to its previous value, or remove it from
the context if the variable was not set before. It can be created
only by calling ContextVar.set().
For debug and introspection purposes it has:
* a read-only attribute Token.var pointing to the variable that created the token;
* a read-only attribute Token.old_value set to the value the variable had before the set() call, or to Token.MISSING if the variable wasn't set before.
Having the ContextVar.set() method return a Token object, together
with the ContextVar.reset(token) method, allows context variables
to be removed from the context if they were not in it before the
set() call.
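A minimal sketch of how a Token might be used to build a scoped setter. The helper bound() and the variable request_id are hypothetical names, not part of the proposed API:

```python
from contextlib import contextmanager
from contextvars import ContextVar, Token

request_id = ContextVar("request_id")  # hypothetical variable, no default

@contextmanager
def bound(var, value):
    """Temporarily set `var` to `value`; restore the previous state on exit."""
    token = var.set(value)
    try:
        yield token
    finally:
        var.reset(token)

with bound(request_id, "abc123") as token:
    inside = request_id.get()
    was_missing = token.old_value is Token.MISSING
    same_var = token.var is request_id

# The variable was unset before set(), so reset() removed it again.
after = request_id.get(None)
```

Note that the caller never has to special-case the "was not set" state: the token carries that information.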
contextvars.Context
A Context object is a mapping of context variables to values.
Context() creates an empty context. To get a copy of the current
Context for the current OS thread, use the
contextvars.copy_context() function::

    ctx = contextvars.copy_context()

To run Python code in some Context, use the Context.run() method::

    ctx.run(function)

Any changes to any context variables that function causes will
be contained in the ctx context::

    var = ContextVar('var')
    var.set('spam')

    def function():
        assert var.get() == 'spam'

        var.set('ham')
        assert var.get() == 'ham'

    ctx = copy_context()

    # Any changes that 'function' makes to 'var' will stay
    # isolated in the 'ctx'.
    ctx.run(function)

    assert var.get() == 'spam'
Any changes to the context will be contained in the Context
object on which run() is called.
Context.run() is used to control in which context asyncio
callbacks and Tasks are executed. It can also be used to run some
code in a different thread in the context of the current thread::

    executor = ThreadPoolExecutor()
    current_context = contextvars.copy_context()

    executor.submit(
        lambda: current_context.run(some_function))
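A runnable variant of the snippet above might look as follows. The variable user and the body of some_function are assumptions made for the sake of the example:

```python
import contextvars
from concurrent.futures import ThreadPoolExecutor

user = contextvars.ContextVar("user", default="anonymous")  # hypothetical

def some_function():
    # Executes in a worker thread, but inside the submitted context.
    return user.get()

user.set("alice")
current_context = contextvars.copy_context()

with ThreadPoolExecutor(max_workers=1) as executor:
    # Context.run is passed as the callable, so no lambda is needed.
    future = executor.submit(current_context.run, some_function)
    with_ctx = future.result()

print(with_ctx)
```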
Context objects implement the collections.abc.Mapping ABC.
This can be used to introspect context objects::

    ctx = contextvars.copy_context()

    # Print all context variables and their values in 'ctx':
    print(ctx.items())

    # Print the value of 'some_variable' in context 'ctx':
    print(ctx[some_variable])
asyncio
asyncio uses Loop.call_soon(), Loop.call_later(),
and Loop.call_at() to schedule the asynchronous execution of a
function. asyncio.Task uses call_soon() to run the
wrapped coroutine.

We modify Loop.call_{at,later,soon} and
Future.add_done_callback() to accept the new optional context
keyword-only argument, which defaults to the current context::

    def call_soon(self, callback, *args, context=None):
        if context is None:
            context = contextvars.copy_context()

        # ... some time later
        context.run(callback, *args)
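From the caller's side, the new keyword might be used as in the sketch below. It assumes Python 3.7+, where call_soon() accepts context=; the names var, seen, and callback are illustrative:

```python
import asyncio
import contextvars

var = contextvars.ContextVar("var", default="default")
seen = []

def callback(label):
    seen.append((label, var.get()))

async def main():
    loop = asyncio.get_running_loop()

    # Build a separate context in which 'var' has a different value.
    custom = contextvars.copy_context()
    custom.run(var.set, "custom")

    loop.call_soon(callback, "implicit")                   # current context is captured
    loop.call_soon(callback, "explicit", context=custom)   # runs in 'custom'
    await asyncio.sleep(0.01)   # give both callbacks a chance to run

asyncio.run(main())
print(seen)
```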
Tasks in asyncio need to maintain their own context that they inherit
from the point they were created at. asyncio.Task is modified
as follows::

    class Task:
        def __init__(self, coro):
            ...
            # Get the current context snapshot.
            self._context = contextvars.copy_context()
            self._loop.call_soon(self._step, context=self._context)

        def _step(self, exc=None):
            ...
            # Every advance of the wrapped coroutine is done in
            # the task's context.
            self._loop.call_soon(self._step, context=self._context)
            ...
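The effect of per-task contexts can be observed with a short sketch (Python 3.7+). The variable request and the coroutine handle are hypothetical names:

```python
import asyncio
import contextvars

request = contextvars.ContextVar("request", default=None)

async def handle(name):
    request.set(name)          # visible only inside this task's context
    await asyncio.sleep(0)     # let the other task run
    return request.get()       # still this task's own value

async def main():
    request.set("parent")
    results = await asyncio.gather(handle("a"), handle("b"))
    # Changes made inside the child tasks do not leak back to the parent.
    return results, request.get()

results, parent_value = asyncio.run(main())
print(results, parent_value)
```

Each task starts from a copy of the context that was current when it was created, so the children see their own values and the parent keeps its own.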
C API
1. PyContextVar * PyContextVar_New(char *name, PyObject *default): create a ContextVar object.
2. int PyContextVar_Get(PyContextVar *, PyObject *default_value, PyObject **value): return -1 if an error occurs during the lookup, 0 otherwise. If a value for the context variable is found, it will be set to the value pointer. Otherwise, value will be set to default_value when it is not NULL. If default_value is NULL, value will be set to the default value of the variable, which can be NULL too. value is always a borrowed reference.
3. PyContextToken * PyContextVar_Set(PyContextVar *, PyObject *): set the value of the variable in the current context.
4. PyContextVar_Reset(PyContextVar *, PyContextToken *): reset the value of the context variable.
5. PyContext * PyContext_New(): create a new empty context.
6. PyContext * PyContext_Copy(): get a copy of the current context.
7. int PyContext_Enter(PyContext *) and int PyContext_Exit(PyContext *) allow setting and restoring the context for the current OS thread. It is required to always restore the previous context::

       PyContext *old_ctx = PyContext_Copy();
       if (old_ctx == NULL) goto error;

       if (PyContext_Enter(new_ctx)) goto error;

       // run some code

       if (PyContext_Exit(old_ctx)) goto error;
Implementation
This section explains high-level implementation details in pseudo-code. Some optimizations are omitted to keep this section short and clear.
For the purposes of this section, we implement an immutable dictionary
using dict.copy()::

    class _ContextData:

        def __init__(self):
            self._mapping = dict()

        def get(self, key):
            return self._mapping[key]

        def set(self, key, value):
            copy = _ContextData()
            copy._mapping = self._mapping.copy()
            copy._mapping[key] = value
            return copy

        def delete(self, key):
            copy = _ContextData()
            copy._mapping = self._mapping.copy()
            del copy._mapping[key]
            return copy
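To make the copy-on-write behaviour concrete, here is a self-contained sketch that mirrors the pseudo-code above. The optional mapping argument of the constructor is a convenience added for this example and is not part of the PEP's pseudo-code:

```python
class _ContextData:
    """Copy-on-write mapping: every update returns a new object."""

    def __init__(self, mapping=None):
        # 'mapping' is an illustrative shortcut for building copies.
        self._mapping = dict(mapping or {})

    def get(self, key):
        return self._mapping[key]

    def set(self, key, value):
        copy = _ContextData(self._mapping)
        copy._mapping[key] = value
        return copy

    def delete(self, key):
        copy = _ContextData(self._mapping)
        del copy._mapping[key]
        return copy

empty = _ContextData()
one = empty.set("x", 1)    # 'empty' is left untouched
two = one.set("x", 2)      # 'one' still maps "x" to 1
gone = two.delete("x")     # 'two' still maps "x" to 2
```

Since no existing object is ever mutated, a reference to an older _ContextData acts as a stable snapshot.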
Every OS thread has a reference to the current _ContextData.
PyThreadState is updated with a new context_data field that
points to a _ContextData object::

    class PyThreadState:
        context_data: _ContextData
contextvars.copy_context() is implemented as follows::

    def copy_context():
        ts: PyThreadState = PyThreadState_Get()

        if ts.context_data is None:
            ts.context_data = _ContextData()

        ctx = Context()
        ctx._data = ts.context_data
        return ctx
contextvars.Context is a wrapper around _ContextData::

    class Context(collections.abc.Mapping):

        def __init__(self):
            self._data = _ContextData()

        def run(self, callable, *args, **kwargs):
            ts: PyThreadState = PyThreadState_Get()
            saved_data: _ContextData = ts.context_data

            try:
                ts.context_data = self._data
                return callable(*args, **kwargs)
            finally:
                self._data = ts.context_data
                ts.context_data = saved_data

        # Mapping API methods are implemented by delegating
        # `get()` and other Mapping calls to `self._data`.
contextvars.ContextVar interacts with
PyThreadState.context_data directly::

    class ContextVar:

        def __init__(self, name, *, default=_NO_DEFAULT):
            self._name = name
            self._default = default

        @property
        def name(self):
            return self._name

        def get(self, default=_NO_DEFAULT):
            ts: PyThreadState = PyThreadState_Get()
            data: _ContextData = ts.context_data

            try:
                return data.get(self)
            except KeyError:
                pass

            if default is not _NO_DEFAULT:
                return default

            if self._default is not _NO_DEFAULT:
                return self._default

            raise LookupError

        def set(self, value):
            ts: PyThreadState = PyThreadState_Get()
            data: _ContextData = ts.context_data

            try:
                old_value = data.get(self)
            except KeyError:
                old_value = Token.MISSING

            ts.context_data = data.set(self, value)
            return Token(self, old_value)

        def reset(self, token):
            if token._used:
                return

            ts: PyThreadState = PyThreadState_Get()
            data: _ContextData = ts.context_data

            if token._old_value is Token.MISSING:
                ts.context_data = data.delete(token._var)
            else:
                ts.context_data = data.set(token._var,
                                           token._old_value)

            token._used = True


    class Token:

        MISSING = object()

        def __init__(self, var, old_value):
            self._var = var
            self._old_value = old_value
            self._used = False

        @property
        def var(self):
            return self._var

        @property
        def old_value(self):
            return self._old_value
Implementation Notes
* The internal immutable dictionary for Context is implemented using Hash Array Mapped Tries (HAMT). They allow for an O(log N) set operation and an O(1) copy_context() function, where N is the number of items in the dictionary. For a detailed analysis of HAMT performance please refer to :pep:`550` [1]_.
* ContextVar.get() has an internal cache for the most recent value, which allows bypassing a hash lookup. This is similar to the optimization the decimal module implements to retrieve its context from PyThreadState_GetDict(). See :pep:`550`, which explains the implementation of the cache in great detail.
Summary of the New APIs
1. A new contextvars module with ContextVar, Context, and Token classes, and a copy_context() function.
2. asyncio.Loop.call_at(), asyncio.Loop.call_later(), asyncio.Loop.call_soon(), and asyncio.Future.add_done_callback() run callback functions in the context they were called in. A new context keyword-only parameter can be used to specify a custom context.
3. asyncio.Task is modified internally to maintain its own context.
Design Considerations
Why contextvars.Token and not ContextVar.unset()?
The Token API makes it possible to avoid a ContextVar.unset()
method, which is incompatible with the chained contexts design of
:pep:`550`. Future compatibility with :pep:`550` is desired
(at least for Python 3.7) in case there is demand to support
context variables in generators and asynchronous generators.
The Token API also offers better usability: the user does not have to special-case absence of a value. Compare::

    token = cv.set(blah)
    try:
        # code
    finally:
        cv.reset(token)

with::

    _deleted = object()
    old = cv.get(default=_deleted)
    try:
        cv.set(blah)
        # code
    finally:
        if old is _deleted:
            cv.unset()
        else:
            cv.set(old)
Rejected Ideas
Replication of threading.local() interface
Please refer to :pep:`550`, where this topic is covered in detail: [2]_.
Backwards Compatibility
This proposal preserves 100% backwards compatibility.
Libraries that use threading.local() to store context-related
values currently work correctly only for synchronous code. Switching
them to use the proposed API will keep their behavior for synchronous
code unmodified, but will automatically enable support for
asynchronous code.
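As a rough illustration of such a switch, consider a hypothetical library that stores a locale setting. All function and variable names below are invented for the example:

```python
import threading
from contextvars import ContextVar

# Before: thread-local storage (hypothetical library code).
_tls = threading.local()

def get_locale_tls():
    return getattr(_tls, "locale", "en")

def set_locale_tls(value):
    _tls.locale = value

# After: the same interface backed by a context variable.
_locale = ContextVar("locale", default="en")

def get_locale():
    return _locale.get()

def set_locale(value):
    _locale.set(value)

# Synchronous callers see the same behaviour from both versions.
default_pair = (get_locale_tls(), get_locale())
set_locale_tls("fr")
set_locale("fr")
updated_pair = (get_locale_tls(), get_locale())
```

The public functions keep their signatures, so callers do not need to change; only the storage behind them does.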
Reference Implementation
The reference implementation can be found here: [3]_.
References
.. [1] https://www.python.org/dev/peps/pep-0550/#appendix-hamt-performance-analysis
.. [2] https://www.python.org/dev/peps/pep-0550/#replication-of-threading-local-interface
.. [3] https://github.com/python/cpython/pull/5027
Copyright
This document has been placed in the public domain.
..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End: