[Python-Dev] generator expression syntax

Robert Mollitor mollitor at earthlink.net
Wed Mar 24 00:46:37 EST 2004


Hi,

If I may, I would like to make a comment about the generator expression syntax.

The idea of generator expressions seems good, but the proposed syntax seems a little wrong to me.

First, the syntax is too dependent on the parentheses. To my mind, this is a fourth meaning for parentheses. (1) Parentheses group and associate expressions: "(a * (b + c))", "if (a and b)". (2) Parentheses construct tuples: "(1, 2, 3)", "()", "('a',)". (3) Parentheses enclose argument lists (arguably a special case of tuple-constructor): "def f(a, b, c)", "obj.dump()", "class C (A, B)". And now (4*) generator expressions: "(x*x for x in list)". I realize that in some sense the parentheses are not part of the expression syntax (since we wouldn't require "sum((x * x for x in list))"), but they are mandatory nonetheless because you can't have "a = x*x for x in list". This seems like it is stretching a familiar construct too far.

Second, it looks like a "tuple comprehension". The list comprehension construct yields a list. A generator expression looks like it should yield a tuple, but it doesn't. In fact, the iterator object that is returned is not even subscriptable. While

def f(arg):
    for a in arg:
        print a
f(x*x for x in (1,2,3))

will work as expected,

def g(arg):
    print arg[1:]
g(x*x for x in (1,2,3))

will not.
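To make the difference concrete, here is a minimal sketch. It assumes the generator expression syntax proposed in PEP 289; the helper name `tail` is hypothetical and only illustrates one possible workaround. Slicing the iterator fails with a TypeError, while consuming it with itertools.islice or converting it with list() first both give the expected result.

```python
import itertools

def tail(arg):
    # Hypothetical helper: same intent as g() above, but it accepts any
    # iterable by skipping the first item instead of slicing.
    return list(itertools.islice(arg, 1, None))

gen = (x*x for x in (1, 2, 3))
try:
    gen[1:]               # generator objects do not support subscripting
    sliced = True
except TypeError:
    sliced = False

squares_tail = tail(x*x for x in (1, 2, 3))
as_list_tail = list(x*x for x in (1, 2, 3))[1:]
```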

Third, it seems Lisp-ish or Objective-C-ish and not Pythonic to me. I realize that is just a style thing, but that's the flavor I get.

Fourth, it seems like this variable binding thing will trip people up because it is not obvious that a function is being defined. Lambdas have variable binding issues, but that is obviously a special construct. The current generator expression construct is too deceptively simple looking for its own good, in my opinion. My (admittedly weak) understanding of the variable binding issue is that the behavior of something like

a = (1,2,3)
b = (x*x for x in a)
a = (7,8,9)
for c in b:
    print c

is still up in the air. It seems that either way it is resolved will be confusing for various reasons.
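For reference, here is a sketch of how this example behaves in Python as later released (2.4 onward). This describes the eventual implementation and was not settled when this message was written. The outermost iterable is evaluated when the generator expression is created, so rebinding `a` afterwards has no effect, whereas other free names are looked up only when the generator runs. The names `scale`, `c_values` and `d_values` are illustrative only.

```python
a = (1, 2, 3)
b = (x*x for x in a)      # 'a' is evaluated here, at creation time
a = (7, 8, 9)             # rebinding 'a' does not change what b iterates over
c_values = list(b)

scale = 1
d = (scale*x for x in (1, 2, 3))   # 'scale' is a free name, looked up lazily
scale = 10
d_values = list(d)
```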

OK, I am not completely sure if this will work to everyone's satisfaction, but here is my proposal: replace the

(x*x for x in a)

construct with

lambda: yield x*x for x in a

CONS

PROS

    lambdef: 'lambda' ( varargslist ':' test | ':' ( test | 'yield' test gen_for ))

or

    lambdef: 'lambda' ( varargslist ':' test | ':' ( test | 'yield' test [gen_for] ))

(The last variant would allow a single element generator to be specified. Maybe not terribly useful, but as useful as

def f(a): yield a

I suppose)

So here would be the recasting of some of the examples in PEP 289:

sum(lambda: yield x*x for x in range(10))

d = dict(lambda: yield k, func(k) for k in keylist)

sum(lambda: yield a.myattr for a in data)

g = lambda: yield x**2 for x in range(10)
print g.next()

reduce(operator.add, lambda: yield x**2 for x in range(10))

lambda: yield x for x in (1, 2, 3)   # assuming we don't use list_for instead of gen_for in the grammar, I guess

# Now if we use lambda behavior, then I don't think we would have free variable binding, so
x = 0
g = lambda: yield x for c in "abc"   # The 'c' variable would not be exported
x = 1
print g.next()   # prints 1 (current x), not 0 (captured x)
x = 2
print g.next()   # would it print 2 now? Obviously I don't have a firm grasp on the bytecode implementation
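The question about `x` can be checked today with an ordinary generator function standing in for the proposed lambda form. This is only a sketch of the closest existing equivalent, not the proposed syntax itself, and the name `make_gen` is hypothetical. Because `x` is a free name inside the function body, it is looked up each time the `yield` expression is evaluated.

```python
def make_gen():
    # Ordinary generator function standing in for the proposed
    # "lambda: yield x for c in 'abc'" (hypothetical equivalent).
    for c in "abc":
        yield x           # 'x' is looked up each time this line runs

x = 0
g = make_gen()
x = 1
first = next(g)           # the current x, not the x at creation time
x = 2
second = next(g)          # looked up again when the generator resumes
```

Under these semantics the second call does see the new value, so the answer to "would it print 2 now?" would be yes.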

# I think the following would work, too
for xx in lambda: yield x*x for x in range(10):
    print xx

# If so, how's this for decadent
for i,a in lambda: yield i,list[i] for i in range(len(list)):
    print i, a
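For comparison, the same index-and-item loop can be spelled with constructs that already exist. In this sketch the helper `indexed` is hypothetical and simply writes the proposed lambda out as a generator function; the builtin enumerate (available since Python 2.3) gives the same pairs. The variable `items` stands in for the sequence called `list` above.

```python
items = ['a', 'b', 'c']   # stands in for the sequence called 'list' above

def indexed(seq):
    # Hypothetical helper: the proposed lambda written as a generator function.
    for i in range(len(seq)):
        yield i, seq[i]

pairs_from_helper = list(indexed(items))
pairs_from_enumerate = list(enumerate(items))
```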

I hope this provided some food for constructive thought.

Robert Mollitor


