[Python-Dev] PEP 572: Assignment Expressions

Chris Angelico rosuav at gmail.com
Wed Apr 18 14:17:40 EDT 2018


On Thu, Apr 19, 2018 at 2:18 AM, Guido van Rossum <guido at python.org> wrote:

On Wed, Apr 18, 2018 at 7:35 AM, Chris Angelico <rosuav at gmail.com> wrote:

On Wed, Apr 18, 2018 at 11:58 PM, Guido van Rossum <guido at python.org> wrote:

> I can't tell from this what the PEP actually says should happen in that
> example. When I first saw it I thought "Gaah! What a horrible piece of
> code." But it works today, and people's code will break if we change its
> meaning.
>
> However we won't have to break that. Suppose the code is (perversely)
>
> t = range(3)
> a = [t for t in t if t]
>
> If we translate this to
>
> t = range(3)
> def listcomp(t=t):
>     a = []
>     for t in t:
>         if t:
>             a.append(t)
>     return a
> a = listcomp()
>
> Then it will still work. The trick will be to recognize "imported" names
> that are also assigned and capture those (as well as other captures as
> already described in the PEP).

That can be done. However, this form of importing will have one of two consequences:

1) Referencing an unbound name will scan to outer scopes at run time, changing the semantics of Python name lookups

I'm not even sure what this would do.

The implicit function of the listcomp would attempt to LOAD_FAST 't', and upon finding that it doesn't have a value for it, would go and look for the name 't' in a surrounding scope. (Probably LOAD_CLOSURE.)
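For comparison, here is what current Python does with a hand-written function that takes no parameter for 't'. Because 't' is assigned inside the function, it is local throughout, so reading it before the loop binds it raises UnboundLocalError rather than falling back to the outer 't'. The function name is made up for this sketch; it is not what the compiler generates.

```python
t = range(3)

def listcomp_without_capture():  # hypothetical name, for illustration only
    result = []
    # 't' is assigned by the for-loop, so the compiler treats it as local
    # everywhere in this function. The iterable 't' is read before any
    # binding exists, so the lookup fails instead of scanning outward.
    for t in t:
        if t:
            result.append(t)
    return result

try:
    listcomp_without_capture()
except UnboundLocalError:
    outcome = "UnboundLocalError"
else:
    outcome = "no error"
```

Consequence 1 would amount to changing this failure into a run-time search of the enclosing scopes.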

2) Genexps will eagerly evaluate a lookup if it happens to be the same name as an internal iteration variable.

I think we would have to specify this more precisely. Let's say by "eagerly evaluate a lookup" you mean "include it in the function parameters with a default value being the lookup" (i.e. starting in the outer scope), IOW "t=t" as I showed above.

Yes. To be technically precise, there's no default argument involved, and the call to the implicit function explicitly passes all the arguments.
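A rough hand-written equivalent of that variant, assuming the implicit function simply takes 't' as an ordinary parameter (the name '_listcomp' is invented for this sketch):

```python
t = range(3)

def _listcomp(t):  # hypothetical name; plain parameter, no default value
    result = []
    for t in t:    # the parameter supplies the iterable, then gets rebound
        if t:
            result.append(t)
    return result

a = _listcomp(t)         # the outer 't' is looked up here, at the call site
b = [t for t in t if t]  # the comprehension as it can be written today
```

Both spellings evaluate the outer 't' exactly once, before any iteration starts, and give the same list.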

The question is when we would do this. IIUC the PEP already does this if the "outer scope" is a class scope for any names that a simple static analysis shows are references to variables in the class scope.

Correct.
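As background, this is how a class body and a genexp interact in current Python, which is presumably the gap that passing class-level names as parameters is meant to close. The names are arbitrary:

```python
x = [10, 20]  # module-level name

class X:
    x = [1, 2]  # class-level name, shadows the module-level one in the body

    # Outermost iterable: evaluated in the class body, so it sees [1, 2].
    from_iterable = list(q for q in x)

    # Any other position: evaluated inside the implicit function, which
    # skips the class scope and finds the module-level x instead.
    from_body = list(x for _ in range(1))
```

Here X.from_iterable is built from the class-level list, while X.from_body contains the module-level one.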

(I don't know exactly what this static analysis should do but it could be as simple as gathering all names that are assigned to in the class, or alternatively all names assigned to before the point where the comprehension occurs. We shouldn't be distracted by dynamic definitions like exec() although we should perhaps be aware of del.)

At the moment, it isn't aware of 'del'. The analysis is simple and 100% static: If a name is in the table of names the class uses AND it's in the table of names the comprehension uses, it gets passed as a parameter. I don't want to try to be aware of del, because of this:

class X:
    x = 1
    if y:
        del x
    print(x)
    z = (q for q in x if q)

If y is false, x stays bound in the class body, and this will eagerly look up x using the same semantics in both the print and the genexp (on construction, not when you iterate over the genexp). If y is true, x has been deleted, but it'll still eagerly look up x, and it'll still use the same semantics for print and the genexp (and it'll find an 'x' in a surrounding scope).
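The fallback after a 'del' can be checked in current Python with an ordinary class-body lookup standing in for the print. This sketch only shows the existing lookup rule, not the proposed parameter passing, and the helper name is made up:

```python
x = "outer"  # module-level name the class body can fall back to

def class_body_lookup(y):  # hypothetical helper, for illustration only
    class X:
        x = "class"
        if y:
            del x
        seen = x  # the same lookup a print(x) would perform at this point
    return X.seen
```

class_body_lookup(False) returns "class"; class_body_lookup(True) returns "outer", because once the 'del' has run the class-body lookup continues into the module globals.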

(The current implementation actually is a bit different from that. I'm not sure whether it's possible to do it as simply as given without an extra compilation pass. But it's close enough.)
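The "in both tables" rule can be approximated from outside the compiler with the stdlib symtable module. This is only an illustration of the rule as stated, not the reference implementation; note that a name used solely in the outermost iterable does not appear in the genexp's own table, so the example reads x in the condition instead:

```python
import symtable

source = """
class X:
    x = 1
    z = (q for q in range(3) if q < x)
"""

module_table = symtable.symtable(source, "<example>", "exec")
class_table = module_table.get_children()[0]   # symbol table for class X
genexp_table = class_table.get_children()[0]   # symbol table for the genexp

# Names known to both the class body and the comprehension: under the
# rule described above, these are the ones passed in as parameters.
shared = set(class_table.get_identifiers()) & set(genexp_table.get_identifiers())
```

For this source, the only shared name is 'x'.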

My proposal is to extend this static analysis for certain loop control variables (any simple name assigned to in a for-clause in the comprehension), regardless of what kind of scope the outer scope is. If the outer scope is a function we already know how to do this. If it's a class we use the analysis referred to above. If the outer scope is the global scope we have to do something new. I propose to use the same simple static analysis we use for class scopes.

Furthermore I propose to only do this for the loop control variable(s) of the outermost for-clause, since that's the only place where without all this rigmarole we would have a clear difference in behavior with Python 3.7 in cases like [t for t in t]. Oh, and probably we only need to do this if that loop control variable is also used as an expression in the iterable (so we don't waste time doing any of this for e.g. [t for t in q]).

Okay. Here's something that would be doable:

If the name is written to within the comprehension, AND it is read from in the outermost iterable, it is flagged early-bind.
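A minimal sketch of that rule using the stdlib ast module, assuming "written to" means any name in a Store context anywhere inside the comprehension (nested comprehensions included, which is a simplification). The function name is invented, and the real check would live in the compiler's symbol table pass rather than in ast:

```python
import ast

COMPREHENSIONS = (ast.ListComp, ast.SetComp, ast.DictComp, ast.GeneratorExp)

def early_bind_names(comp_source):  # hypothetical helper, for illustration
    """Names assigned inside the comprehension AND read in its outermost iterable."""
    comp = ast.parse(comp_source, mode="eval").body
    if not isinstance(comp, COMPREHENSIONS):
        raise ValueError("expected a comprehension or generator expression")

    # Every name bound anywhere inside the comprehension.
    written = {
        node.id
        for node in ast.walk(comp)
        if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store)
    }
    # Every name read by the iterable of the first (outermost) for-clause.
    outermost_iterable = comp.generators[0].iter
    read = {
        node.id
        for node in ast.walk(outermost_iterable)
        if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Load)
    }
    return written & read
```

With this, "[t for t in t if t]" flags 't', while "[t for t in q]" flags nothing, so the common case pays no extra cost.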

I'll have to try implementing that to be sure, but it should be possible I think. It would cover a lot of cases, keeping them the same as we currently have.

Since we now have once again introduced an exception for the outermost loop control variable and the outermost iterable, we can consider doing this only as a temporary measure. We could have a goal to eventually make [t for t in t] fail, and in the meantime we would deprecate it -- e.g. in 3.8 a silent deprecation, in 3.9 a noisy one, in 3.10 break it. Yes, that's a lot of new static analysis for deprecating an edge case, but it seems reasonable to want to preserve backward compatibility when breaking this edge case since it's likely not all that uncommon. Even if most occurrences are bad style written by lazy programmers, we should not break working code, if it is reasonable to expect that it's relied upon in real code.

Fair enough. So the outermost iterable remains special for a short while, with deprecation.

I'll get onto the coding side of it during my Copious Free Time, hopefully this week some time.

Here's hoping!

ChrisA


