[Python-Dev] Possible resolution of generator expression variable capture dilemma

Phillip J. Eby pje at telecommunity.com
Wed Mar 24 10:37:37 EST 2004


At 05:37 PM 3/24/04 +1200, Greg Ewing wrote:

> An advantage of this approach is that all forms of nested scope (lambda, def, etc.) would benefit, not just generator expressions. I suspect it would eradicate most of the few remaining uses for the default-argument hack, for instance (which nested scopes were supposed to do, but didn't).

Wow. I haven't spent a lot of time thinking this through for possible holes, but my initial reaction is that it'd be a big plus. It has always seemed counterintuitive to me that I can't define a function or lambda expression in a for loop without first having to create a function that returns a function.
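For reference, here is a minimal sketch of the behavior being complained about, plus the two usual workarounds (the default-argument hack and a function that returns a function). It is written for Python 3, and the helper names (make_late, make_default_arg, make_factory, factory) are invented for the illustration.

```python
def make_late():
    funcs = []
    for i in range(3):
        funcs.append(lambda: i)       # every lambda shares the one variable i
    return [f() for f in funcs]


def make_default_arg():
    funcs = []
    for i in range(3):
        funcs.append(lambda i=i: i)   # default-argument hack: value fixed at definition
    return [f() for f in funcs]


def make_factory():
    def factory(i):                   # a function that returns a function
        return lambda: i
    funcs = []
    for i in range(3):
        funcs.append(factory(i))      # each call to factory gets its own i
    return [f() for f in funcs]


late_results = make_late()             # [2, 2, 2]
default_results = make_default_arg()   # [0, 1, 2]
factory_results = make_factory()       # [0, 1, 2]
```

Only the first version shows the surprise: all three lambdas report the final value of i, because they look the variable up when called, not when defined.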

But then... what about while loops? I think it'd be confusing if I changed between a for and a while loop (either direction), and the semantics of nested function definitions changed. Indeed, I think you'd need to make this work for all variables rebound inside all loops that contain a nested scope, not just a for loop's index variable.
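To make the while-loop point concrete, here is a small Python 3 sketch (collect_while is a made-up name) showing that closures defined in a while loop share their variable in exactly the same way as in a for loop today:

```python
def collect_while():
    funcs = []
    n = 0
    while n < 3:
        funcs.append(lambda: n)   # captures the variable n, not its current value
        n += 1                    # n is rebound inside the loop body
    return [f() for f in funcs]


while_results = collect_while()   # [3, 3, 3]: every closure sees the final n
```

A rule that special-cased only the for-loop index would leave this case behaving the old way.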

Would this produce any incompatibilities? If you define a function inside a loop, intending to call it from outside the loop, your code doesn't work in any sane way today. If you define one that's to be called from inside the loop, it will work the same way under the proposal... unless you're rebinding variables after the definition, but before the call point.

So, it does seem that there could be some code that would change its semantics. Such code would have to define a function inside a loop, and reference a variable that is rebound inside the loop, but after the function is defined. E.g.:

    for x in 1,2,3:
        def y():
            print z
        z = x * 2
        y()

Today, this code prints 2, 4, and 6, but under the proposed approach it would presumably get an unbound local error.

So, I think the trick here would be figuring out how to specify this in such a way that it makes sense for its intended use without fouling up code that works today. Reallocating cells at the top of the loop might work:

    for x in 1,2,3:
        def y():
            print z
        z = x * 2
        def q():
            print z
        z = x * 3
        y()
        q()

Today, this code prints 3,3,6,6,9,9, and it would do the same under the proposed approach. What doesn't work is invoking a previous definition after modifying a local:
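One rough way to try the "reallocate cells at the top of the loop" idea in current Python is to move the loop body into a helper function, so that each iteration gets its own z. This is only an emulation under one reading of the proposal, not a specification of it. The sketch is Python 3, collects values in a list instead of printing, and the names two_defs_today, two_defs_fresh and body are invented here.

```python
def two_defs_today():
    out = []
    for x in (1, 2, 3):
        def y():
            out.append(z)
        z = x * 2
        def q():
            out.append(z)
        z = x * 3          # both y and q see this final value when called
        y()
        q()
    return out


def two_defs_fresh():
    out = []

    def body(x):           # each call allocates fresh cells for z, y and q
        def y():
            out.append(z)
        z = x * 2
        def q():
            out.append(z)
        z = x * 3
        y()
        q()

    for x in (1, 2, 3):
        body(x)
    return out


today = two_defs_today()   # [3, 3, 6, 6, 9, 9]
fresh = two_defs_fresh()   # [3, 3, 6, 6, 9, 9]
```

Both versions agree, because every call happens in the same iteration as the definitions and after the last rebinding of z.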

    for x in 1,2,3:
        z = x * 3
        if x>1:
            y()
        def y():
            print z
        z = x * 2

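Today, this prints 6,9, but under the proposed semantics it would print 2,4: each call reaches the y left over from the previous iteration, and that y would still see the previous iteration's z, which was last set to x * 2 there.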
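The stale-definition case can be tried out in the same spirit. The sketch below (Python 3, values collected in a list) compares today's behavior with an emulation that gives each iteration its own z by running the loop body as a function call and handing the previous iteration's y forward explicitly. The emulation reflects just one reading of "fresh cells at the top of each iteration"; stale_call_today, stale_call_fresh, body and prev_y are hypothetical names.

```python
def stale_call_today():
    out = []
    for x in (1, 2, 3):
        z = x * 3
        if x > 1:
            y()            # y was defined in the previous iteration
        def y():
            out.append(z)  # looks up the single shared z at call time
        z = x * 2
    return out


def stale_call_fresh():
    out = []

    def body(x, prev_y):   # each call gets its own z
        z = x * 3
        if x > 1:
            prev_y()       # still bound to the previous iteration's z
        def y():
            out.append(z)
        z = x * 2
        return y           # hand this iteration's y to the next one

    prev = None
    for x in (1, 2, 3):
        prev = body(x, prev)
    return out


today = stale_call_today()   # [6, 9]
fresh = stale_call_fresh()   # [2, 4]
```

In this emulation the stale y reports the value its own iteration last stored in z, not the value assigned in the iteration that calls it, so the two versions disagree.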

Admittedly, I am hard-pressed to imagine an actual use case for this pattern of code execution, but if anybody's done it, their code would break.

Unfortunately, this means that even your comparatively modest proposal (only 'for' loops, and only the index variable) can have this same issue, if the loop index variable is being rebound. This latter pattern (define in one iteration, invoke in a later one) will change its meaning under such a capture scheme.


