[Python-Dev] PEP 550 v4
Yury Selivanov yselivanov.ml at gmail.com
Fri Aug 25 18:32:22 EDT 2017
Hi,
This is the 4th iteration of the PEP that Elvis and I have rewritten from scratch.
The specification section has been separated from the implementation section, which makes them easier to follow.
During the rewrite, we realized that generators and coroutines should work with the EC in exactly the same way (coroutines used to be created with no LC in prior versions of the PEP).
We also renamed Context Keys to Context Variables, which seems to be a more appropriate name.
Hopefully this update will resolve the remaining questions about the specification and the proposed implementation, and will allow us to focus on refining the API.
Yury
PEP: 550
Title: Execution Context
Version: $Revision$
Last-Modified: $Date$
Author: Yury Selivanov <yury at magic.io>,
        Elvis Pranskevichus <elvis at magic.io>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 11-Aug-2017
Python-Version: 3.7
Post-History: 11-Aug-2017, 15-Aug-2017, 18-Aug-2017, 25-Aug-2017
Abstract
This PEP adds a new generic mechanism for ensuring consistent access to non-local state in the context of out-of-order execution, such as in Python generators and coroutines.
Thread-local storage, such as threading.local(), is inadequate for
programs that execute concurrently in the same OS thread. This PEP
proposes a solution to this problem.
Rationale
Prior to the advent of asynchronous programming in Python, programs
used OS threads to achieve concurrency. The need for thread-specific
state was solved by threading.local() and its C-API equivalent,
PyThreadState_GetDict().
A few examples of where thread-local storage (TLS) is commonly relied upon:
- Context managers like decimal contexts, numpy.errstate, and warnings.catch_warnings.
- Request-related data, such as security tokens and request data in web applications, language context for gettext, etc.
- Profiling, tracing, and logging in large code bases.
Unfortunately, TLS does not work well for programs which execute concurrently in a single thread. A Python generator is the simplest example of a concurrent program. Consider the following::
    import decimal
    from decimal import Decimal

    def fractions(precision, x, y):
        with decimal.localcontext() as ctx:
            ctx.prec = precision
            yield Decimal(x) / Decimal(y)
            yield Decimal(x) / Decimal(y**2)

    g1 = fractions(precision=2, x=1, y=3)
    g2 = fractions(precision=6, x=2, y=3)

    items = list(zip(g1, g2))

The expected value of items is::
[(Decimal('0.33'), Decimal('0.666667')),
(Decimal('0.11'), Decimal('0.222222'))]
Rather surprisingly, the actual result is::
[(Decimal('0.33'), Decimal('0.666667')),
(Decimal('0.111111'), Decimal('0.222222'))]
This is because Decimal context is stored as a thread-local, so
concurrent iteration of the fractions() generator would corrupt
the state. A similar problem exists with coroutines.
Applications also often need to associate certain data with a given thread of execution. For example, a web application server commonly needs access to the current HTTP request object.
The inadequacy of TLS in asynchronous code has led to the proliferation of ad-hoc solutions, which are limited in scope and do not support all required use cases.
Currently, any library (including the standard library) that relies on TLS is likely to break when used in asynchronous code or with generators (see [3]_ for an example issue).
Some languages that support coroutines or generators recommend
passing the context manually as an argument to every function; see [1]_
for an example. This approach, however, has limited use for Python,
where there is a large ecosystem that was built to work with a TLS-like
context. Furthermore, libraries like decimal or numpy rely
on context implicitly in overloaded operator implementations.
The .NET runtime, which has support for async/await, has a generic
solution for this problem, called ExecutionContext (see [2]_).
Goals
The goal of this PEP is to provide a more reliable
threading.local() alternative, which:
- provides the mechanism and the API to fix non-local state issues with coroutines and generators;
- has no or negligible performance impact on the existing code or the code that will be using the new mechanism, including libraries like decimal and numpy.
High-Level Specification
The full specification of this PEP is broken down into three parts:
- High-Level Specification (this section): the description of the overall solution. We show how it applies to generators and coroutines in user code, without delving into implementation details.
- Detailed Specification: the complete description of new concepts, APIs, and related changes to the standard library.
- Implementation Details: the description and analysis of data structures and algorithms used to implement this PEP, as well as the necessary changes to CPython.
For the purpose of this section, we define execution context as an opaque container of non-local state that allows consistent access to its contents in the concurrent execution environment.
A context variable is an object representing a value in the
execution context. A new context variable is created by calling
the new_context_var() function. A context variable object has
two methods:
- lookup(): returns the value of the variable in the current execution context;
- set(): sets the value of the variable in the current execution context.
Regular Single-threaded Code
In regular, single-threaded code that doesn't involve generators or coroutines, context variables behave like globals::
    var = new_context_var()

    def sub():
        assert var.lookup() == 'main'
        var.set('sub')

    def main():
        var.set('main')
        sub()
        assert var.lookup() == 'sub'
Multithreaded Code
In multithreaded code, context variables behave like thread locals::
    import threading

    var = new_context_var()

    def sub():
        assert var.lookup() is None  # The execution context is empty
                                     # for each new thread.
        var.set('sub')

    def main():
        var.set('main')

        thread = threading.Thread(target=sub)
        thread.start()
        thread.join()

        assert var.lookup() == 'main'
Generators
In generators, changes to context variables are local and are not visible to the caller, but are visible to the code called by the generator. Once set in the generator, the context variable is guaranteed not to change between iterations::
    var = new_context_var()

    def gen():
        var.set('gen')
        assert var.lookup() == 'gen'
        yield 1
        assert var.lookup() == 'gen'
        yield 2

    def main():
        var.set('main')

        g = gen()
        next(g)
        assert var.lookup() == 'main'

        var.set('main modified')
        next(g)
        assert var.lookup() == 'main modified'
Changes to caller's context variables are visible to the generator (unless they were also modified inside the generator)::
    var = new_context_var()

    def gen():
        assert var.lookup() == 'var'
        yield 1

        assert var.lookup() == 'var modified'
        yield 2

    def main():
        g = gen()

        var.set('var')
        next(g)

        var.set('var modified')
        next(g)
Now, let's revisit the decimal precision example from the Rationale_
section, and see how the execution context can improve the situation::

    import decimal

    decimal_prec = new_context_var()  # create a new context variable

    # Pre-PEP 550 Decimal relies on TLS for its context.
    # This subclass switches the decimal context storage
    # to the execution context for illustration purposes.
    #
    class MyDecimal(decimal.Decimal):
        def __init__(self, value="0"):
            prec = decimal_prec.lookup()
            if prec is None:
                raise ValueError('could not find decimal precision')
            context = decimal.Context(prec=prec)
            super().__init__(value, context=context)

    def fractions(precision, x, y):
        # Normally, this would be set by a context manager,
        # but for simplicity we do this directly.
        decimal_prec.set(precision)

        yield MyDecimal(x) / MyDecimal(y)
        yield MyDecimal(x) / MyDecimal(y**2)

    g1 = fractions(precision=2, x=1, y=3)
    g2 = fractions(precision=6, x=2, y=3)

    items = list(zip(g1, g2))

The value of items is::
[(Decimal('0.33'), Decimal('0.666667')),
(Decimal('0.11'), Decimal('0.222222'))]
which matches the expected result.
Coroutines and Asynchronous Tasks
In coroutines, like in generators, context variable changes are local and are not visible to the caller::
    import asyncio

    var = new_context_var()

    async def sub():
        assert var.lookup() == 'main'
        var.set('sub')
        assert var.lookup() == 'sub'

    async def main():
        var.set('main')
        await sub()
        assert var.lookup() == 'main'

    loop = asyncio.get_event_loop()
    loop.run_until_complete(main())
To establish the full semantics of execution context in coroutines,
we must also consider tasks. A task is the abstraction used by
asyncio, and other similar libraries, to manage the concurrent
execution of coroutines. In the example above, a task is created
implicitly by the run_until_complete() function.
asyncio.wait_for() is another example of implicit task creation::

    async def sub():
        await asyncio.sleep(1)
        assert var.lookup() == 'main'

    async def main():
        var.set('main')

        # waiting for sub() directly
        await sub()

        # waiting for sub() with a timeout
        await asyncio.wait_for(sub(), timeout=2)

        var.set('main changed')
Intuitively, we expect the assertion in sub() to hold true in both
invocations, even though the wait_for() implementation actually
spawns a task, which runs sub() concurrently with main().
Thus, tasks must capture a snapshot of the current execution
context at the moment of their creation and use it to execute the
wrapped coroutine whenever that happens. If this is not done, then
innocuous-looking changes like wrapping a coroutine in a wait_for()
call would cause surprising breakage. This leads to the following::

    import asyncio

    var = new_context_var()

    async def sub():
        # Sleeping will make sub() run after
        # `var` is modified in main().
        await asyncio.sleep(1)

        assert var.lookup() == 'main'

    async def main():
        var.set('main')
        loop.create_task(sub())  # schedules asynchronous execution
                                 # of sub().
        assert var.lookup() == 'main'
        var.set('main changed')

    loop = asyncio.get_event_loop()
    loop.run_until_complete(main())
In the above code we show how sub(), running in a separate task,
sees the value of var as it was when loop.create_task(sub())
was called.
Like tasks, the intuitive behaviour of callbacks scheduled with either
Loop.call_soon(), Loop.call_later(), or
Future.add_done_callback() is to also capture a snapshot of the
current execution context at the point of scheduling, and use it to
run the callback::

    import asyncio
    import logging

    current_request = new_context_var()

    def log_error(e):
        logging.error('error when handling request %r',
                      current_request.lookup())

    async def render_response():
        ...

    async def handle_get_request(request):
        current_request.set(request)

        try:
            return await render_response()
        except Exception as e:
            asyncio.get_event_loop().call_soon(log_error, e)
            return '500 - Internal Server Error'
Detailed Specification
Conceptually, an execution context (EC) is a stack of logical contexts. There is one EC per Python thread.
A logical context (LC) is a mapping of context variables to their values in that particular LC.
A context variable is an object representing a value in the
execution context. A new context variable object is created by calling
the sys.new_context_var(name: str) function. The value of the
name argument is not used by the EC machinery, but may be used for
debugging and introspection.
The context variable object has the following methods and attributes:
- name: the value passed to new_context_var().
- lookup(): traverses the execution context top-to-bottom, until the variable value is found. Returns None, if the variable is not present in the execution context;
- set(): sets the value of the variable in the topmost logical context.
Generators
When created, each generator object has an empty logical context object
stored in its __logical_context__ attribute. This logical context
is pushed onto the execution context at the beginning of each generator
iteration and popped at the end::

    var1 = sys.new_context_var('var1')
    var2 = sys.new_context_var('var2')

    def gen():
        var1.set('var1-gen')
        var2.set('var2-gen')

        # EC = [
        #     outer_LC(),
        #     gen_LC({var1: 'var1-gen', var2: 'var2-gen'})
        # ]
        n = nested_gen()  # nested_gen_LC is created
        next(n)
        # EC = [
        #     outer_LC(),
        #     gen_LC({var1: 'var1-gen', var2: 'var2-gen'})
        # ]

        var1.set('var1-gen-mod')
        var2.set('var2-gen-mod')
        # EC = [
        #     outer_LC(),
        #     gen_LC({var1: 'var1-gen-mod', var2: 'var2-gen-mod'})
        # ]
        next(n)
        yield  # makes gen() a generator, as the list(g) call below requires

    def nested_gen():
        # EC = [
        #     outer_LC(),
        #     gen_LC({var1: 'var1-gen', var2: 'var2-gen'}),
        #     nested_gen_LC()
        # ]
        assert var1.lookup() == 'var1-gen'
        assert var2.lookup() == 'var2-gen'

        var1.set('var1-nested-gen')
        # EC = [
        #     outer_LC(),
        #     gen_LC({var1: 'var1-gen', var2: 'var2-gen'}),
        #     nested_gen_LC({var1: 'var1-nested-gen'})
        # ]

        yield

        # EC = [
        #     outer_LC(),
        #     gen_LC({var1: 'var1-gen-mod', var2: 'var2-gen-mod'}),
        #     nested_gen_LC({var1: 'var1-nested-gen'})
        # ]
        assert var1.lookup() == 'var1-nested-gen'
        assert var2.lookup() == 'var2-gen-mod'

        yield

    # EC = [outer_LC()]

    g = gen()  # gen_LC is created for the generator object `g`
    list(g)

    # EC = [outer_LC()]
The snippet above shows the state of the execution context stack throughout the generator lifespan.
contextlib.contextmanager
Earlier, we've used the following example::
    import decimal

    # create a new context variable
    decimal_prec = sys.new_context_var('decimal_prec')

    # ...

    def fractions(precision, x, y):
        decimal_prec.set(precision)

        yield MyDecimal(x) / MyDecimal(y)
        yield MyDecimal(x) / MyDecimal(y**2)
Let's extend it by adding a context manager::
    import contextlib

    @contextlib.contextmanager
    def precision_context(prec):
        old_prec = decimal_prec.lookup()

        try:
            decimal_prec.set(prec)
            yield
        finally:
            decimal_prec.set(old_prec)

Unfortunately, this would not work straight away, as the modification
to the decimal_prec variable is confined to the
precision_context() generator, and therefore will not be visible
inside the with block::

    def fractions(precision, x, y):
        # EC = [{}, {}]

        with precision_context(precision):
            # EC becomes [{}, {}, {decimal_prec: precision}] in the
            # *precision_context()* generator,
            # but here the EC is still [{}, {}]

            # raises ValueError('could not find decimal precision')!
            yield MyDecimal(x) / MyDecimal(y)
            yield MyDecimal(x) / MyDecimal(y**2)

The way to fix this is to set the generator's __logical_context__
attribute to None. This will cause the generator to avoid
modifying the execution context stack.
We modify the contextlib.contextmanager() decorator to
set genobj.__logical_context__ to None to produce
well-behaved context managers::

    def fractions(precision, x, y):
        # EC = [{}, {}]

        with precision_context(precision):
            # EC = [{}, {decimal_prec: precision}]

            yield MyDecimal(x) / MyDecimal(y)
            yield MyDecimal(x) / MyDecimal(y**2)

        # EC becomes [{}, {decimal_prec: None}]
asyncio
asyncio uses Loop.call_soon, Loop.call_later,
and Loop.call_at to schedule the asynchronous execution of a
function. asyncio.Task uses call_soon() to further the
execution of the wrapped coroutine.
We modify Loop.call_{at,later,soon} to accept the new
optional execution_context keyword argument, which defaults to
the copy of the current execution context::

    def call_soon(self, callback, *args, execution_context=None):
        if execution_context is None:
            execution_context = sys.get_execution_context()

        # ... some time later

        sys.run_with_execution_context(
            execution_context, callback, *args)

The sys.get_execution_context() function returns a shallow copy
of the current execution context. By shallow copy here we mean such
a new execution context that:
- lookups in the copy provide the same results as in the original execution context, and
- any changes in the original execution context do not affect the copy, and
- any changes to the copy do not affect the original execution context.
Either of the following satisfies the copy requirements:
- a new stack with shallow copies of logical contexts;
- a new stack with one squashed logical context.
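The second option can be checked against the three requirements with a toy model. This sketch assumes the EC is a list of ordinary dicts with the bottom context first; squash() and lookup() are hypothetical helpers written for the illustration.

```python
def lookup(ec, key):
    # Search the stack top-to-bottom, as ContextVar.lookup() would.
    for lc in reversed(ec):
        if key in lc:
            return lc[key]
    return None


def squash(ec):
    """Merge a stack of logical contexts into a new one-element stack."""
    squashed = {}
    for lc in ec:              # bottom first, so upper contexts win
        squashed.update(lc)
    return [squashed]


original = [{'a': 1, 'b': 2}, {}, {'a': 3}]
copy = squash(original)

# Requirement 1: lookups give the same results in both.
same_lookups = all(
    lookup(copy, key) == lookup(original, key) for key in ('a', 'b', 'c'))

# Requirements 2 and 3: later changes on one side do not leak to the other.
original[-1]['b'] = 'changed in original'
copy[-1]['c'] = 'changed in copy'
```

The squashed copy holds the same value objects as the original, which is what makes it a shallow copy; only the mappings themselves are new.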
The sys.run_with_execution_context(ec, func, *args, **kwargs)
function runs func(*args, **kwargs) with ec as the execution
context. The function performs the following steps:
- Set ec as the current execution context stack in the current thread.
- Push an empty logical context onto the stack.
- Run func(*args, **kwargs).
- Pop the logical context from the stack.
- Restore the original execution context stack.
- Return or raise the func() result.
These steps ensure that ec cannot be modified by func,
which makes run_with_execution_context() idempotent.
asyncio.Task is modified as follows::

    class Task:
        def __init__(self, coro):
            ...
            # Get the current execution context snapshot.
            self._exec_context = sys.get_execution_context()

            self._loop.call_soon(
                self._step,
                execution_context=self._exec_context)

        def _step(self, exc=None):
            ...
            self._loop.call_soon(
                self._step,
                execution_context=self._exec_context)
            ...
Generators Transformed into Iterators
Any Python generator can be represented as an equivalent iterator. Compilers like Cython rely on this axiom. With respect to the execution context, such an iterator should behave the same way as the generator it represents.
This means that there needs to be a Python API to create new logical contexts and run code with a given logical context.
The sys.new_logical_context() function creates a new empty
logical context.
The sys.run_with_logical_context(lc, func, *args, **kwargs)
function can be used to run functions in the specified logical context.
The lc can be modified as a result of the call.
The sys.run_with_logical_context() function performs the following
steps:
- Push lc onto the current execution context stack.
- Run func(*args, **kwargs).
- Pop lc from the execution context stack.
- Return or raise the func() result.
By using new_logical_context() and run_with_logical_context(),
we can replicate the generator behaviour like this::

    class Generator:

        def __init__(self):
            self.logical_context = sys.new_logical_context()

        def __iter__(self):
            return self

        def __next__(self):
            return sys.run_with_logical_context(
                self.logical_context, self._next_impl)

        def _next_impl(self):
            # Actual __next__ implementation.
            ...
Let's see how this pattern can be applied to a real generator::
    # create a new context variable
    decimal_prec = sys.new_context_var('decimal_precision')

    def gen_series(n, precision):
        decimal_prec.set(precision)

        for i in range(1, n):
            yield MyDecimal(i) / MyDecimal(3)

    # gen_series is equivalent to the following iterator:

    class Series:

        def __init__(self, n, precision):
            # Create a new empty logical context on creation,
            # like the generators do.
            self.logical_context = sys.new_logical_context()

            # run_with_logical_context() pushes
            # self.logical_context onto the execution context stack,
            # runs self._init, and pops self.logical_context
            # from the stack.
            sys.run_with_logical_context(
                self.logical_context, self._init, n, precision)

        def _init(self, n, precision):
            self.i = 1
            self.n = n
            self.precision = precision
            decimal_prec.set(precision)

        def __iter__(self):
            return self

        def __next__(self):
            return sys.run_with_logical_context(
                self.logical_context, self._next_impl)

        def _next_impl(self):
            # Stop where range(1, n) in gen_series() would stop.
            if self.i >= self.n:
                raise StopIteration
            decimal_prec.set(self.precision)
            result = MyDecimal(self.i) / MyDecimal(3)
            self.i += 1
            return result
For regular iterators such an approach to logical context management is
normally not necessary, and it is recommended to set and restore
context variables directly in __next__::

    class Series:

        def __next__(self):
            old_prec = decimal_prec.lookup()

            try:
                decimal_prec.set(self.precision)
                ...
            finally:
                decimal_prec.set(old_prec)
Asynchronous Generators
The execution context semantics in asynchronous generators do not differ from those of regular generators and coroutines.
Implementation
Execution context is implemented as an immutable linked list of logical contexts, where each logical context is an immutable weak key mapping. A pointer to the currently active execution context is stored in the OS thread state::
+-----------------+
| | ec
| PyThreadState +-------------+
| | |
+-----------------+ |
|
ec_node ec_node ec_node v
+------+------+ +------+------+ +------+------+
| NULL | lc |<----| prev | lc |<----| prev | lc |
+------+--+---+ +------+--+---+ +------+--+---+
| | |
LC v LC v LC v
+-------------+ +-------------+ +-------------+
| var1: obj1 | | EMPTY | | var1: obj4 |
| var2: obj2 | +-------------+ +-------------+
| var3: obj3 |
+-------------+
The choice of the immutable list of immutable mappings as a fundamental
data structure is motivated by the need to efficiently implement
sys.get_execution_context(), which is to be frequently used by
asynchronous tasks and callbacks. When the EC is immutable,
get_execution_context() can simply copy the current execution
context by reference::

    def get_execution_context(self):
        return PyThreadState_Get().ec
Let's review all possible context modification scenarios:
- The ContextVar.set() method is called::

      def ContextVar_set(self, val):
          # See a more complete set() definition
          # in the `Context Variables`_ section.

          tstate = PyThreadState_Get()
          top_ec_node = tstate.ec
          top_lc = top_ec_node.lc
          new_top_lc = top_lc.set(self, val)
          tstate.ec = ec_node(
              prev=top_ec_node.prev,
              lc=new_top_lc)

- The sys.run_with_logical_context() function is called, in which case the passed logical context object is appended to the execution context::

      def run_with_logical_context(lc, func, *args, **kwargs):
          tstate = PyThreadState_Get()

          old_top_ec_node = tstate.ec
          new_top_ec_node = ec_node(prev=old_top_ec_node, lc=lc)

          try:
              tstate.ec = new_top_ec_node
              return func(*args, **kwargs)
          finally:
              tstate.ec = old_top_ec_node

- The sys.run_with_execution_context() function is called, in which case the current execution context is set to the passed execution context with a new empty logical context appended to it::

      def run_with_execution_context(ec, func, *args, **kwargs):
          tstate = PyThreadState_Get()

          old_top_ec_node = tstate.ec
          new_lc = sys.new_logical_context()
          new_top_ec_node = ec_node(prev=ec, lc=new_lc)

          try:
              tstate.ec = new_top_ec_node
              return func(*args, **kwargs)
          finally:
              tstate.ec = old_top_ec_node

- Either genobj.send(), genobj.throw(), or genobj.close() is called on a genobj generator, in which case the logical context recorded in genobj is pushed onto the stack::

      PyGen_New(PyGenObject *gen):
          gen.logical_context = sys.new_logical_context()

      gen_send(PyGenObject *gen, ...):
          tstate = PyThreadState_Get()

          if gen.logical_context is not None:
              old_top_ec_node = tstate.ec
              new_top_ec_node = ec_node(
                  prev=old_top_ec_node,
                  lc=gen.logical_context)

              try:
                  tstate.ec = new_top_ec_node
                  return _gen_send_impl(gen, ...)
              finally:
                  gen.logical_context = tstate.ec.lc
                  tstate.ec = old_top_ec_node
          else:
              return _gen_send_impl(gen, ...)

Coroutines and asynchronous generators share the implementation with generators, and the above changes apply to them as well.
In certain scenarios the EC may need to be squashed to limit the size of the chain. For example, consider the following corner case::
    import asyncio

    async def repeat(coro, delay):
        await coro()
        await asyncio.sleep(delay)
        loop.create_task(repeat(coro, delay))

    async def ping():
        print('ping')

    loop = asyncio.get_event_loop()
    loop.create_task(repeat(ping, 1))
    loop.run_forever()
In the above code, the EC chain will grow as long as repeat() is
called. Each new task will call sys.run_with_execution_context(),
which will append a new logical context to the chain. To prevent
unbounded growth, sys.get_execution_context() checks if the chain
is longer than a predetermined maximum, and if it is, squashes the
chain into a single LC::

    def get_execution_context():
        tstate = PyThreadState_Get()

        if tstate.ec_len > EC_LEN_MAX:
            squashed_lc = sys.new_logical_context()

            # Use a separate name for the cursor so that the
            # ec_node() constructor stays usable below.
            node = tstate.ec
            while node:
                # The LC.merge() method does not replace existing keys.
                squashed_lc = squashed_lc.merge(node.lc)
                node = node.prev

            return ec_node(prev=NULL, lc=squashed_lc)
        else:
            return tstate.ec
Logical Context
Logical context is an immutable weak key mapping which has the following properties with respect to garbage collection:
- ContextVar objects are strongly-referenced only from the application code, not from any of the Execution Context machinery or values they point to. This means that there are no reference cycles that could extend their lifespan longer than necessary, or prevent their collection by the GC.
- Values put in the Execution Context are guaranteed to be kept alive while there is a ContextVar key referencing them in the thread.
- If a ContextVar is garbage collected, all of its values will be removed from all contexts, allowing them to be GCed if needed.
- If a thread has ended its execution, its thread state will be cleaned up along with its ExecutionContext, cleaning up all values bound to all context variables in the thread.
As discussed earlier, we need sys.get_execution_context() to be
consistently fast regardless of the size of the execution context, so
logical context is necessarily an immutable mapping.
Choosing dict for the underlying implementation is suboptimal,
because LC.set() will cause dict.copy(), which is an O(N)
operation, where N is the number of items in the LC.
get_execution_context(), when squashing the EC, is an O(M)
operation, where M is the total number of context variable values
in the EC.
So, instead of dict, we choose Hash Array Mapped Trie (HAMT)
as the underlying implementation of logical contexts. (Scala and
Clojure use HAMT to implement high performance immutable collections
[5]_, [6]_.)
With HAMT .set() becomes an O(log N) operation, and
get_execution_context() squashing is more efficient on average due
to structural sharing in HAMT.
See `Appendix: HAMT Performance Analysis`_ for a more elaborate
analysis of HAMT performance compared to dict.
Context Variables
The ContextVar.lookup() and ContextVar.set() methods are
implemented as follows (in pseudo-code)::

    class ContextVar:

        def lookup(self):
            tstate = PyThreadState_Get()

            ec_node = tstate.ec
            while ec_node:
                if self in ec_node.lc:
                    return ec_node.lc[self]
                ec_node = ec_node.prev

            return None

        def set(self, value):
            tstate = PyThreadState_Get()
            top_ec_node = tstate.ec

            if top_ec_node is not None:
                top_lc = top_ec_node.lc
                new_top_lc = top_lc.set(self, value)
                tstate.ec = ec_node(
                    prev=top_ec_node.prev,
                    lc=new_top_lc)
            else:
                top_lc = sys.new_logical_context()
                new_top_lc = top_lc.set(self, value)
                tstate.ec = ec_node(
                    prev=NULL,
                    lc=new_top_lc)

For efficient access in performance-sensitive code paths, such as in
numpy and decimal, we add a cache to ContextVar.lookup(),
making it an O(1) operation when the cache is hit. The cache key is
composed from the following:
- The new uint64_t PyThreadState->unique_id, which is a globally unique thread state identifier. It is computed from the new uint64_t PyInterpreterState->ts_counter, which is incremented whenever a new thread state is created.
- The uint64_t ContextVar->version counter, which is incremented whenever the context variable value is changed in any logical context in any thread.
The cache is then implemented as follows::
    class ContextVar:

        def set(self, value):
            ...  # implementation
            self.version += 1

        def lookup(self):
            tstate = PyThreadState_Get()

            if (self.last_tstate_id == tstate.unique_id and
                    self.last_version == self.version):
                return self.last_value

            value = self._lookup_uncached()

            self.last_value = value  # borrowed ref
            self.last_tstate_id = tstate.unique_id
            self.last_version = self.version

            return value

Note that last_value is a borrowed reference. The assumption
is that if the version checks are fine, the object will be alive.
This allows the values of context variables to be properly garbage
collected.
This generic caching approach is similar to what the current C
implementation of decimal does to cache the current decimal
context, and has similar performance characteristics.
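For readers who want to run the caching idea, here is a small self-contained approximation. It is only a sketch under several assumptions: threading.get_ident() stands in for the proposed unique_id (unlike unique_id, it may be reused after a thread exits), a per-thread dict stands in for the full execution context walk, and the cache holds an ordinary strong reference rather than a borrowed one. CachedVar and slow_lookups are names invented for the example.

```python
import threading


class CachedVar:
    def __init__(self):
        self._values = {}        # thread id -> value; stand-in for the EC walk
        self.version = 0
        self._last_tid = None
        self._last_version = None
        self._last_value = None
        self.slow_lookups = 0    # counts cache misses, for demonstration only

    def set(self, value):
        self._values[threading.get_ident()] = value
        self.version += 1        # any set() invalidates the cached entry

    def _lookup_uncached(self):
        self.slow_lookups += 1
        return self._values.get(threading.get_ident())

    def lookup(self):
        tid = threading.get_ident()
        if self._last_tid == tid and self._last_version == self.version:
            return self._last_value          # cache hit: O(1)

        value = self._lookup_uncached()
        self._last_value = value
        self._last_tid = tid
        self._last_version = self.version
        return value


var = CachedVar()
var.set('a')
reads = [var.lookup() for _ in range(3)]
lookups_after_reads = var.slow_lookups   # only the first read was a miss
var.set('b')
after_set = var.lookup()                 # the version bump forces a re-read
```

Repeated reads between writes hit the cache, which is the access pattern expected in code such as decimal arithmetic.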
Performance Considerations
Tests of the reference implementation based on the prior revisions of this PEP have shown a 1-2% slowdown on generator microbenchmarks and no noticeable difference in macrobenchmarks.
The performance of non-generator and non-async code is not affected by this PEP.
Summary of the New APIs
Python
The following new Python APIs are introduced by this PEP:
- The sys.new_context_var(name: str='...') function to create ContextVar objects.
- The ContextVar object, which has:
  - the read-only .name attribute,
  - the .lookup() method which returns the value of the variable in the current execution context;
  - the .set() method which sets the value of the variable in the current execution context.
- The sys.get_execution_context() function, which returns a copy of the current execution context.
- The sys.new_execution_context() function, which returns a new empty execution context.
- The sys.new_logical_context() function, which returns a new empty logical context.
- The sys.run_with_execution_context(ec: ExecutionContext, func, *args, **kwargs) function, which runs func with the provided execution context.
- The sys.run_with_logical_context(lc: LogicalContext, func, *args, **kwargs) function, which runs func with the provided logical context on top of the current execution context.
C API
- PyContextVar * PyContext_NewVar(char *desc): create a PyContextVar object.
- PyObject * PyContext_LookupVar(PyContextVar *): return the value of the variable in the current execution context.
- int PyContext_SetVar(PyContextVar *, PyObject *): set the value of the variable in the current execution context.
- PyLogicalContext * PyLogicalContext_New(): create a new empty PyLogicalContext.
- PyExecutionContext * PyExecutionContext_New(): create a new empty PyExecutionContext.
- PyExecutionContext * PyExecutionContext_Get(): return the current execution context.
- int PyExecutionContext_Set(PyExecutionContext *): set the passed EC object as the current for the active thread state.
- int PyExecutionContext_SetWithLogicalContext(PyExecutionContext *, PyLogicalContext *): allows implementing the sys.run_with_logical_context Python API.
Design Considerations
Should PyThreadState_GetDict() use the execution context?
No. PyThreadState_GetDict is based on TLS, and changing its
semantics will break backwards compatibility.
PEP 521
:pep:`521` proposes an alternative solution to the problem, which
extends the context manager protocol with two new methods:
__suspend__() and __resume__(). Similarly, the asynchronous
context manager protocol is also extended with __asuspend__() and
__aresume__().
This allows implementing context managers that manage non-local state, which behave correctly in generators and coroutines.
For example, consider the following context manager, which uses the execution context::

    class Context:

        def __init__(self):
            self.var = new_context_var('var')

        def __enter__(self):
            self.old_x = self.var.lookup()
            self.var.set('something')

        def __exit__(self, *err):
            self.var.set(self.old_x)
An equivalent implementation with PEP 521::
    import threading

    local = threading.local()

    class Context:

        def __enter__(self):
            self.old_x = getattr(local, 'x', None)
            local.x = 'something'

        def __suspend__(self):
            local.x = self.old_x

        def __resume__(self):
            local.x = 'something'

        def __exit__(self, *err):
            local.x = self.old_x
The downside of this approach is the addition of significant new complexity to the context manager protocol and the interpreter implementation. This approach is also likely to negatively impact the performance of generators and coroutines.
Additionally, the solution in :pep:`521` is limited to context managers,
and does not provide any mechanism to propagate state in asynchronous
tasks and callbacks.
Can Execution Context be implemented outside of CPython?
No. Proper generator behaviour with respect to the execution context requires changes to the interpreter.
Should we update sys.displayhook and other APIs to use EC?
APIs like redirecting stdout by overwriting sys.stdout
, or
specifying new exception display hooks by overwriting the
sys.displayhook
function are affecting the whole Python process
by design. Their users assume that the effect of changing
them will be visible across OS threads. Therefore we cannot
just make these APIs to use the new Execution Context.
That said, we think it is possible to design new APIs that will be context aware, but that is outside of the scope of this PEP.
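The process-wide behaviour described above can be observed directly. The following is a minimal sketch; the helper redirect_in_worker and the variable names are hypothetical and chosen for illustration. One thread overwrites sys.stdout, and a print() call in a different thread is redirected as well.

```python
import io
import sys
import threading


def redirect_in_worker(buffer, ready, done):
    """Overwrite sys.stdout from a worker thread, then restore it."""
    original = sys.stdout
    sys.stdout = buffer          # affects every thread in the process
    ready.set()
    try:
        done.wait()              # keep the redirection active for a while
    finally:
        sys.stdout = original


buffer = io.StringIO()
ready, done = threading.Event(), threading.Event()
worker = threading.Thread(target=redirect_in_worker, args=(buffer, ready, done))
worker.start()

ready.wait()
print("written by the main thread")   # ends up in the worker's buffer
done.set()
worker.join()

captured = buffer.getvalue()
```

Existing code may depend on this cross-thread visibility, which is why the section argues that these APIs cannot simply be switched over to the Execution Context.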
Greenlets
Greenlet is an alternative implementation of cooperative scheduling for Python. Although the greenlet package is not part of CPython, popular frameworks like gevent rely on it, and it is important that greenlet can be modified to support execution contexts.
Conceptually, the behaviour of greenlets is very similar to that of generators, which means that similar changes around greenlet entry and exit can be done to add support for execution context.
Backwards Compatibility
This proposal preserves 100% backwards compatibility.
Appendix: HAMT Performance Analysis
.. figure:: pep-0550-hamt_vs_dict-v2.png
   :align: center
   :width: 100%

   Figure 1.  Benchmark code can be found here: [9]_.
The above chart demonstrates that:
HAMT displays near O(1) performance for all benchmarked dictionary sizes.
dict.copy() becomes very slow around 100 items.
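The cost of the dict side of this comparison is easy to measure locally. The following is a rough sketch, not the benchmark referenced in [9]_: it times a copy-on-write update of a plain dict (copy, then set one key) at a few sizes. The function name copy_and_set_cost and the chosen sizes are assumptions made for illustration, and absolute numbers will vary between machines.

```python
import timeit


def copy_and_set_cost(size, repeat=2000):
    """Return the average time in seconds to copy a dict of `size` items and set one key."""
    base = {f"key{i}": i for i in range(size)}

    def copy_on_write():
        new = base.copy()    # copies every item, so the cost grows with size
        new["extra"] = 0
        return new

    return timeit.timeit(copy_on_write, number=repeat) / repeat


costs = {size: copy_and_set_cost(size) for size in (1, 10, 100, 1000)}
for size, seconds in costs.items():
    print(f"{size:>5} items: {seconds * 1e9:8.0f} ns per copy-and-set")
```

An immutable HAMT mapping avoids the full copy by sharing most of its tree between the old and new versions.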
.. figure:: pep-0550-lookup_hamt.png
   :align: center
   :width: 100%

   Figure 2.  Benchmark code can be found here: [10]_.
Figure 2 compares the lookup costs of dict versus a HAMT-based immutable mapping. HAMT lookup time is 30-40% slower than Python dict lookups on average, which is a very good result, considering that the latter is very well optimized.
There is research [8]_ showing that there are further possible improvements to the performance of HAMT.
The reference implementation of HAMT for CPython can be found here: [7]_.
Acknowledgments
Thanks to Victor Petrovykh for countless discussions around the topic and PEP proofreading and edits.
Thanks to Nathaniel Smith for proposing the ContextVar design [17]_ [18]_, for pushing the PEP towards a more complete design, and for coming up with the idea of having a stack of contexts in the thread state.
Thanks to Nick Coghlan for numerous suggestions and ideas on the mailing list, and for coming up with a case that caused the complete rewrite of the initial PEP version [19]_.
Version History
Initial revision, posted on 11-Aug-2017 [20]_.
V2 posted on 15-Aug-2017 [21]_.
The fundamental limitation that caused a complete redesign of the first version was that it was not possible to implement an iterator that would interact with the EC in the same way as generators (see [19]_).
Version 2 was a complete rewrite, introducing new terminology (Local Context, Execution Context, Context Item) and new APIs.
V3 posted on 18-Aug-2017 [22]_.
Updates:
Local Context was renamed to Logical Context. The term "local" was ambiguous and conflicted with local name scopes.
Context Item was renamed to Context Key, see the thread with Nick Coghlan, Stefan Krah, and Yury Selivanov [23]_ for details.
Context Item get cache design was adjusted, per Nathaniel Smith's idea in [25]_.
Coroutines are created without a Logical Context; the ceval loop no longer needs to special case the await expression (proposed by Nick Coghlan in [24]_).
V4 posted on 25-Aug-2017: the current version.
The specification section has been completely rewritten.
Context Key renamed to Context Var.
Removed the distinction between generators and coroutines with respect to logical context isolation.
References
.. [1] https://blog.golang.org/context
.. [2] https://msdn.microsoft.com/en-us/library/system.threading.executioncontext.aspx
.. [3] https://github.com/numpy/numpy/issues/9444
.. [4] http://bugs.python.org/issue31179
.. [5] https://en.wikipedia.org/wiki/Hash_array_mapped_trie
.. [6] http://blog.higher-order.net/2010/08/16/assoc-and-clojures-persistenthashmap-part-ii.html
.. [7] https://github.com/1st1/cpython/tree/hamt
.. [8] https://michael.steindorfer.name/publications/oopsla15.pdf
.. [9] https://gist.github.com/1st1/9004813d5576c96529527d44c5457dcd
.. [10] https://gist.github.com/1st1/dbe27f2e14c30cce6f0b5fddfc8c437e
.. [11] https://github.com/1st1/cpython/tree/pep550
.. [12] https://www.python.org/dev/peps/pep-0492/#async-await
.. [13] https://github.com/MagicStack/uvloop/blob/master/examples/bench/echoserver.py
.. [14] https://github.com/MagicStack/pgbench
.. [15] https://github.com/python/performance
.. [16] https://gist.github.com/1st1/6b7a614643f91ead3edf37c4451a6b4c
.. [17] https://mail.python.org/pipermail/python-ideas/2017-August/046752.html
.. [18] https://mail.python.org/pipermail/python-ideas/2017-August/046772.html
.. [19] https://mail.python.org/pipermail/python-ideas/2017-August/046775.html
.. [20] https://github.com/python/peps/blob/e8a06c9a790f39451d9e99e203b13b3ad73a1d01/pep-0550.rst
.. [21] https://github.com/python/peps/blob/e3aa3b2b4e4e9967d28a10827eed1e9e5960c175/pep-0550.rst
.. [22] https://github.com/python/peps/blob/287ed87bb475a7da657f950b353c71c1248f67e7/pep-0550.rst
.. [23] https://mail.python.org/pipermail/python-ideas/2017-August/046801.html
.. [24] https://mail.python.org/pipermail/python-ideas/2017-August/046790.html
.. [25] https://mail.python.org/pipermail/python-ideas/2017-August/046786.html
Copyright
This document has been placed in the public domain.