[Python-Dev] Iteration variables and list comprehensions

Tim Peters tim.one@home.com
Wed, 30 May 2001 03:47:47 -0400


[David Beazley]

> ... However, I've also been shooting myself in the foot a little more than usual ... Because of this, I have frequently found myself debugging the following programming error:

If "frequently" is "a little more than usual", then it sounds like your problems in all areas are too common for us to really help you by fixing this one <wink>.

OK, I'm afraid the behavior follows from taking seriously the idea that listcomps are syntactic sugar for a specific pattern of nested loops and "if" tests. That was done to make it explainable, and the correspondence is indeed exact. The implementation already creates "invisible" names:

>>> [repr(name) for name in globals().keys()]
["'__builtins__'", "'__name__'", "'name'", "'__doc__'", "'_[1]'"]

Where did "_[1]" come from? You guessed it. Look for it after the listcomp finishes and it's gone:

>>> globals().keys()
['__builtins__', '__name__', 'name', '__doc__']

It's invisible because it's a temp var you wouldn't see in the equivalent loop nest.
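As a rough sketch of that correspondence (the function name and the `_tmp` variable are made up here; `_tmp` merely stands in for the hidden "_[1]"), a listcomp with two "for" clauses and an "if" test unrolls into a loop nest like this:

```python
def unrolled(xs, ys):
    """Loop nest equivalent to [(x, y) for x in xs if x % 2 for y in ys]."""
    _tmp = []                     # stands in for the hidden "_[1]" temporary
    for x in xs:                  # first "for" clause
        if x % 2:                 # the "if" test
            for y in ys:          # second "for" clause
                _tmp.append((x, y))
    return _tmp


xs, ys = range(4), "ab"
pairs = unrolled(xs, ys)
print(pairs)
```

The clauses nest left to right, in the same order they are written in the listcomp.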

> ... Therefore, I'm wondering if it would make any sense to make the iterator variables used inside of a list comprehension private in some manner

I'm not sure it's worth losing the exact correspondence with nested loops; or that it's not worth it either. Note that "the iterator variables" needn't be bare names:

>>> class x:
...     pass
...
>>> [1 for x.i in range(3)]
[1, 1, 1]
>>> x.i
2

This complicates explaining exactly how you want to deviate from the for-loop model. So, I think, does this:

>>> [i for i in range(2) for i in range(2, 5)]
[2, 3, 4, 2, 3, 4]

That is, even in simple cases, is the desired scope attached to the "for" or to the "[]"? Python doesn't have a problem with reusing a name as a for target in nested loops (or in listcomps today).
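Spelled out as ordinary loops, the two cases look like the sketch below. The class `X` and both function names are illustrative only; the loops themselves are plain "for" statements and behave the same way in any Python version.

```python
class X:
    pass


def attribute_target():
    ones = []
    for X.i in range(3):          # the target is an attribute, not a bare name
        ones.append(1)
    return ones, X.i              # X.i keeps the last value assigned


def reused_name():
    out = []
    for i in range(2):
        for i in range(2, 5):     # inner loop rebinds the same name
            out.append(i)
    return out, i                 # i holds whatever the inner loop bound last


print(attribute_target())
print(reused_name())
```

Both leave the last value bound once the loops finish, which is what the for-loop model predicts.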

> ... Just as an aside, I have never intentionally used the iterator variable of a list comprehension after the operation has completed.

Not even in a debugger, when the operation has completed via unexpected exception, and you're desperate to know what the control vrbl was bound to at the time of death? Or in an exception handler?

>>> import sys
>>> try:
...     [i*i for i in xrange(sys.maxint)]
... except OverflowError:
...     raise OverflowError("oops! blew up at %d" % i)
...
Traceback (most recent call last):
  File "<stdin>", line 4, in ?
OverflowError: oops! blew up at 46341
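The same idea can be sketched as an explicit loop that runs on any Python version. Everything named here is an assumption made for illustration: `LIMIT` stands in for sys.maxint on a 32-bit box, and the explicit check stands in for the OverflowError that fixed-size int arithmetic raised at the time.

```python
LIMIT = 2**31 - 1                 # assumed stand-in for sys.maxint on a 32-bit build


def squares_until_overflow(limit=LIMIT):
    out = []
    i = None
    try:
        for i in range(10**6):
            sq = i * i
            if sq > limit:        # emulate the overflow of a fixed-size int
                raise OverflowError("square exceeds limit")
            out.append(sq)
    except OverflowError:
        # The loop variable is still bound here, so the handler can report it.
        return "oops! blew up at %d" % i
    return out


message = squares_until_overflow()
print(message)
```

The handler can name the failing value only because the loop variable outlives the loop.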

Or what about:

i = 12
def f():
    print i
    return [i for i in range(i)]
f()

  1. Should "print i" print 12, or raise UnboundLocalError?

  2. Does the "i" in "range(i)" refer to the global i, or is that just senseless?

So long as the for-loop model is followed faithfully, nothing is hard to explain or predict, and simply because there's nothing truly new.
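Under the for-loop reading, f is just the loop below. This is a sketch with a hypothetical name, `f_loop_model`, and with the print replaced by a plain assignment so that it runs on any Python version. Because the "for" binds i somewhere in the function body, i is local throughout the function, and the first use fails before the global is ever consulted:

```python
i = 12


def f_loop_model():
    seen = i                      # i is local here because of the "for" below
    out = []
    for i in range(i):            # this i would be the local one too
        out.append(i)
    return seen, out


try:
    f_loop_model()
    outcome = "no error"
except UnboundLocalError:
    outcome = "UnboundLocalError"

print(outcome)
```

So the loop model answers both questions with rules that already exist for "for" statements.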

> I was actually quite surprised with this behavior the first time I saw it.

Me too <wink>.

> I suspect most other programmers would not anticipate this side effect either.

I share the suspicion, but am not sure why: "for" is a binding construct in Python, so being surprised by "for" binding a name is itself surprising.

Another principled model is possible, where

[f(i) for i in whatever]

is treated like

(lambda: [f(i) for i in whatever])()

>>> i = 12
>>> (lambda: [i**2 for i in range(4)])()
[0, 1, 4, 9]
>>> i
12

That's more like Haskell does it. But the day we explain a Python construct in terms of a lambda transformation is the day Guido kills all of us <wink>.