Issue 13595: Weird behavior with generators with self-referencing output.
The following self-referencing generator has incorrect output:
from itertools import tee

def ab_combinations():
    #'', 'a', 'b', 'aa', 'ab', 'ba', 'bb', 'aaa', ...
    def _deferred_output():
        yield ""
        tees = tee(output)

        #This definition works fine: '', 'a', 'b', 'aa', 'ab', 'ba', ...
        l = [(item+"a" for item in tees[0]), (item+"b" for item in tees[1])]

        #This definition results in: '', 'b', 'b', 'bb', 'bb', 'bb', ...
        #l = [(item+label for item in t) for t, label in zip(tees,"ab")]

        while True:
            for g in l:
                yield next(g)

    result, output = tee(_deferred_output())
    return result
This is expected, and is due to the late binding of the "label" variable in the "item+label" expression. Look at the example below:
l = [lambda item: item + label for label in "ab"]
f1, f2 = l
print f1(''), f2('')    # prints: b b
For the lambda function, 'label' is a free variable: its value is looked up in the enclosing scope (here, the global environment) when the function is called, not when it is defined. By the time either function is called, 'label' holds only one value, the last one from the for loop, so both functions see 'b'. A generator expression also defines a code block, so the same late binding applies.
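To make the parallel concrete, here is a small sketch in Python 3 syntax (the example above uses the Python 2 print statement). The function names and the sample lists are made up for illustration; they are not part of the reported code. It shows that generator expressions built in a loop pick up the final value of 'label', just as the lambdas do.

```python
def late_bound_lambdas():
    # Each lambda looks up 'label' when it is called, after the loop has finished.
    funcs = [lambda item: item + label for label in "ab"]
    return [f("") for f in funcs]


def late_bound_genexprs():
    # Hypothetical sample data standing in for the tee'd iterators.
    sources = [["x", "y"], ["x", "y"]]
    # 'src' is evaluated immediately for each generator expression,
    # but 'label' is only looked up once the generator is iterated.
    gens = [(item + label for item in src) for src, label in zip(sources, "ab")]
    return [list(g) for g in gens]


lambda_results = late_bound_lambdas()
genexpr_results = late_bound_genexprs()
```

Both results contain only 'b' suffixes, because none of the lambdas or generator expressions run their bodies until the loop variable has reached its last value.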
Now to fix your code, if 'ab' can be of arbitrary length, I can't find a simpler way than yet another inline function:
l = [(lambda lbl:(item + lbl for item in t))(label) for t, label in zip(tees,"ab")]
'lbl' is still a free variable, but now 'lbl' comes from the lambda function, and there is one different function per iteration of the loop; so they are in fact distinct 'lbl' variables, and the generator expressions will correctly see different labels.
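If the inline lambda feels hard to read, one possible alternative is a named generator function, which binds the label as a parameter in the same way. This is only a sketch in Python 3 syntax: the helper name `_with_label` and the `labels` parameter are assumptions added for illustration and do not appear in the original report.

```python
from itertools import islice, tee


def _with_label(items, label):
    # Hypothetical helper: 'label' is a parameter, so every call
    # gets its own binding, fixed at the time of the call.
    for item in items:
        yield item + label


def ab_combinations(labels="ab"):
    # 'labels' is an assumed parameter; the original hard-codes "ab".
    def _deferred_output():
        yield ""
        tees = tee(output, len(labels))
        # One generator per label; no late binding, since the label
        # is passed as an argument rather than read as a free variable.
        l = [_with_label(t, label) for t, label in zip(tees, labels)]
        while True:
            for g in l:
                yield next(g)

    result, output = tee(_deferred_output())
    return result


# Take a finite prefix of the infinite sequence.
first_seven = list(islice(ab_combinations(), 7))
```

With the default labels, `first_seven` matches the sequence given in the comment of the original report: '', 'a', 'b', 'aa', 'ab', 'ba', 'bb'.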