[Python-Dev] Informal educator feedback on PEP 572 (was Re: 2018 Python Language Summit coverage, last part)

Steven D'Aprano steve at pearwood.info
Sat Jun 30 04:17:02 EDT 2018


On Wed, Jun 27, 2018 at 09:52:43PM -0700, Chris Barker wrote:

It seems everyone agrees that scoping rules should be the same for generator expressions and comprehensions,

Yes. I dislike saying "comprehensions and generator expressions" over and over again, so I just say "comprehensions".

Principle One:

Principle Two:

Principle Three:

Principle Four:

So far, there should be (I hope!) no disagreement with those first four principles. With those four principles in place, teaching and using comprehensions (genexprs) in the absence of assignment expressions does not need to change one iota.

Normal cases stay normal; weird cases mucking about with locals() inside the comprehension are already weird and won't change.

So what about:

l = [x:=i for i in range(3)]

vs

g = (x:=i for i in range(3))

Is there any way to keep these consistent if the "x" is in the regular local scope?

Yes. That is what closures already do.

We already have such nonlocal effects in Python 3. Move the loop inside an inner (nested) function, and then either call it immediately to simulate the effect of a list comprehension, or delay calling it to behave more like a generator expression.
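To make that concrete, here is a minimal sketch that runs on any Python 3, with no assignment expressions involved. The function names (eager_demo, lazy_demo, listcomp, genexpr) are made up for illustration; they only simulate by hand what a comprehension or generator expression with a nonlocal side-effect would do.

```python
def eager_demo():
    x = None

    def listcomp():
        # Hand-written stand-in for a list comprehension body.
        nonlocal x
        result = []
        for i in range(3):
            x = i          # side-effect on the enclosing function's x
            result.append(x)
        return result

    values = listcomp()    # called immediately, like a list comprehension
    return values, x


def lazy_demo():
    x = None

    def genexpr():
        # Hand-written stand-in for a generator expression body.
        nonlocal x
        for i in range(3):
            x = i
            yield x

    g = genexpr()          # nothing has run yet
    before = x
    values = list(g)       # running the generator performs the assignments
    return before, values, x
```

In the eager version x is already rebound by the time listcomp() returns; in the lazy version x is untouched until the generator is actually consumed.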

Of course the runtime effects depend on whether or not the generator expression is actually evaluated. But that's no mystery, and is precisely analogous to this case:

def demo():
    x = 1
    def inner():
        nonlocal x
        x = 99
    inner()  # call the inner function
    print(x)

This prints 99. But if you comment out the call to the inner function, it prints 1. I trust that doesn't come as a surprise.

Nor should this come as a surprise:

def demo():
    x = 1
    # assuming assignment scope is local rather than sublocal
    g = (x:= i for i in (99,))
    L = list(g)
    print(x)

The value of x printed will depend on whether or not you comment out the call to list(g).
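For readers on Python 3.8 or later, where PEP 572 was eventually accepted with the assignment target bound in the containing scope, this can be tried directly. The following is only a sketch: the consume flag is an addition for illustration, standing in for commenting the list(g) call in or out.

```python
def demo(consume):
    x = 1
    # The assignment target x belongs to demo(), not to the generator.
    g = ((x := i) for i in (99,))
    if consume:
        list(g)    # running the generator performs the assignment
    return x
```

Here demo(True) returns 99 and demo(False) returns 1, which mirrors the nested-function example.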

Note that this thread is titled "Informal educator feedback on PEP 572".

As an educator -- this is looking harder and harder to explain to newbies... Though easier if any assignments made in a "comprehension" don't "leak out".

Let me introduce two more principles.

Principle Five:

Principle Six:

Five is, I think, so intuitive that we forget about it in the same way that we forget about the air we breathe. It would be surprising, even shocking, if two expressions in the same context were executed in different scopes:

result = [x + 1, x - 2]

If the first x were local and the second was global, that would be disturbing. The same rule ought to apply if we include assignment expressions:

result = [(x := expr) + 1, x := x - 2]

It would be disturbing if the first assignment (x := expr) executed in the local scope, and the second (x := x - 2) failed with NameError because it was executed in the global scope.

Or worse, didn't fail with NameError, but instead returned something totally unexpected.

Now bring in a comprehension:

result = [(x := expr) + 1] + [x := x - 2 for a in (None,)]

Do you still want the x inside the comprehension to be a different x to the one outside the comprehension? How are you going to explain that UnboundLocalError to your students?

That's not actually a rhetorical question. I recognise that while Principle Five seems self-evidently desirable to me, you might consider it less important than the idea that "assignments inside comprehensions shouldn't leak".

I believe that these two expressions should give the same results even to the side-effects:

[(x := expr) + 1, x := x - 2]

[(x := expr) + 1] + [x := x - 2 for a in (None,)]

I think that is the simplest and most intuitive behaviour, the one which will be the least surprising, cause the fewest unexpected NameErrors, and be the simplest to explain.
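As a sketch of what that equivalence looks like in practice, assuming Python 3.8 or later (where assignment expressions inside a comprehension bind in the containing scope), the two forms can be wrapped in functions and compared. The function names and the expr parameter are illustrative only.

```python
def plain_list(expr):
    # Both assignment expressions run in the function's local scope.
    result = [(x := expr) + 1, x := x - 2]
    return result, x


def with_comprehension(expr):
    # The second assignment now sits inside a comprehension,
    # but still targets the same local x.
    result = [(x := expr) + 1] + [x := x - 2 for a in (None,)]
    return result, x
```

With expr set to 10, both functions return ([11, 8], 8): same result, same side-effect on x.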

If you still prefer the "assignments shouldn't leak" idea, consider this: under the current implementation of comprehensions as an implicit hidden function, the scope of a variable depends on where it is, violating Principle Six.

(That was the point of my introducing locals() into a previous post: to demonstrate that, today, right now, "comprehension scope" is a misnomer. Comprehensions actually execute in a hybrid of at least two scopes, the surrounding local scope and the sublocal hidden implicit function scope.)
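One way to observe the two scopes without touching locals() is a class body, since names defined there are not visible from nested scopes. This is a sketch using made-up names (Works, Broken, build_broken, class_limit), not the locals() demonstration from the earlier post.

```python
class Works:
    class_limit = 3
    # range(class_limit) is the outermost iterable, so it is evaluated
    # in the class body, where class_limit is visible.
    values = [i for i in range(class_limit)]


def build_broken():
    class Broken:
        class_limit = 3
        # Here class_limit is the element expression, which is looked up
        # as if from a nested function and so cannot see the class body.
        values = [class_limit for _ in range(1)]
    return Broken


try:
    build_broken()
    outcome = "no error"
except NameError:
    outcome = "NameError"
```

The same name resolves in one position of the comprehension and fails in the other, which is the hybrid behaviour described above.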

Let me bring in another equivalency:

[(x := expr) + 1, x := x - 2]

[(x := expr) + 1] + [x := x - 2 for a in (None,)]

[(x := expr) + 1] + [a for a in (x := x - 2,)]

By Principle Six, the side-effect of assigning to x shouldn't depend on where inside the comprehension it is. The two comprehension expressions shown ought to be referring to the same "x" variable (in the same scope) regardless of whether that is the surrounding local scope, or a sublocal comprehension scope.

(In the case of it being a sublocal scope, the two comprehensions will raise UnboundLocalError.)

But -- and this is why I raised all that hoo-ha about locals() -- according to the current implementation, they don't. This version would assign to x in the sublocal scope:

# best viewed in a monospaced font
[x := x - 2 for a in (None,)]
 ^^^^^^^^^^ this is sublocal scope

but this would assign in the surrounding local scope:

[a for a in (x := x - 2,)]
            ^^^^^^^^^^^^^ this is local scope

I strongly believe that all three ought to be equivalent, including side-effects. (Remember that by Principle Two, we agree that the loop variable doesn't leak. The loop variable is invisible from the outside and doesn't count as a side-effect for this discussion.)
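For reference, PEP 572 as eventually accepted (Python 3.8 and later) avoids the position-dependent case by rejecting assignment expressions in a comprehension's iterable expression at compile time. A small sketch that checks this without running either comprehension; the helper name compiles is hypothetical.

```python
def compiles(source):
    """Return True if source compiles as an expression, False on SyntaxError."""
    try:
        compile(source, "<example>", "eval")
        return True
    except SyntaxError:
        return False


# Assignment in the element expression is accepted.
in_element = compiles("[x := x - 2 for a in (None,)]")

# Assignment in the iterable expression is rejected.
in_iterable = compiles("[a for a in (x := x - 2,)]")
```

On Python 3.8 and later, in_element is True and in_iterable is False.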

So here are three possibilities (assuming assignment expressions are permitted):

  1. Nick doesn't like the idea of having to inject an implicit "nonlocal" into the comprehension hidden implicit function; if we don't, that gives us the case where the scope of assignment variables depends on where they are in the comprehension, and will sometimes leak and sometimes not.

This torpedoes Principle Six, and leaves you having to explain why assignment sometimes "works" inside comprehensions and sometimes gives UnboundLocalError.

  2. If we decide that assignment inside a comprehension should always be sublocal, the implementation becomes more complex in order to bury the otherwise-local scope beneath another layer of even more hidden implicit functions.

That rules out some interesting (but probably not critical) uses of assignment expressions inside comprehensions, such as using them as a side-channel to sneak out debugging information.

And it adds a great big screaming special case to Principle Five:

  3. Or we just make all assignments inside comprehensions (including gen exprs) occur in the surrounding local scope.

Number 3 is my strong preference. It complicates the implementation a bit (needing to add some implicit nonlocals) but not as much as needing to hide the otherwise-local scope beneath another implicit function. And it gives by far the most consistent, useful and obvious semantics out of the three options.

My not-very-extensive survey on the Python-List mailing lists suggests that, if you don't ask people explicitly about "assignment expressions", they already think of the inside of comprehensions as being part of the surrounding local scope rather than a hidden inner function. So I do not believe that this will be hard to teach.

These two expressions ought to give the same result with the same side-effect:

[x := 1]

[x := a for a in (1,)]

That, I strongly believe, is the intuitive behaviour to people who aren't immersed in the implementation details of comprehensions, as well as being the most useful.
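A quick way to check this pair, assuming Python 3.8 or later; the wrapper function names are illustrative only.

```python
def via_display():
    result = [x := 1]
    return result, x


def via_comprehension():
    result = [x := a for a in (1,)]
    return result, x
```

Both functions return ([1], 1).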

-- Steve


