[Python-Dev] accumulator display syntax

Tim Peters tim_one at email.msn.com
Wed Oct 22 21:18:42 EDT 2003


I had a large file today, and needed to find lines matching several patterns simultaneously. It seemed a natural application for generator expressions, so let's see how that looks.

Generalized a bit:

Given:
    "source", an iterable producing elements (like a file producing lines)
    "predicates", a sequence of one-argument functions, mapping element to truth (like a regexp search returning a match object or None)

Create: a generator producing the elements of source for which each predicate is true
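
For reference, the specification can also be written without any pipelining. This is only a sketch in present-day Python 3 to pin down the expected output; the name "matching" and the sample data are made up for illustration, and it is not the pipelined approach discussed in this message.

```python
def matching(source, predicates):
    """Yield each element of source for which every predicate is true."""
    # all() stops at the first predicate that fails for a given element.
    return (e for e in source if all(p(e) for p in predicates))

# Hypothetical sample data: keep the even numbers greater than 5.
is_even = lambda n: n % 2 == 0
is_big = lambda n: n > 5
expected = list(matching(range(10), [is_even, is_big]))
print(expected)  # [6, 8]
```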

This is -- or should be -- an easy application for pipelining generator expressions. Like so:

pipe = source
for p in predicates:
    # add a filter over the current pipe, and call that the new pipe
    pipe = e for e in pipe if p(e)

Now I hope that

for e in pipe:
    print e

prints the desired elements. It will if the "p" and "pipe" in the generator expression use the bindings in effect at the time the generator expression is assigned to pipe. If the generator expression is instead a closure, it's a subtle disaster. You can play with this today like so:

pipe = source
for p in predicates:
    # pipe = e for e in pipe if p(e)
    def g(pipe=pipe, p=p):
        for e in pipe:
            if p(e):
                yield e
    pipe = g()

for e in pipe:
    print e

Those are the semantics for which "it works".
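
The same experiment can be sketched as a self-contained Python 3 script. The wrapper function "filter_all" and the sample predicates are assumptions added for illustration; the inner generator is the one shown above, with both names bound through default arguments.

```python
def filter_all(source, predicates):
    """Yield the elements of source for which every predicate is true."""
    pipe = source
    for p in predicates:
        # Default argument values are evaluated when "def" executes, so each g
        # captures the pipe and p that are current in this loop iteration.
        def g(pipe=pipe, p=p):
            for e in pipe:
                if p(e):
                    yield e
        pipe = g()
    return pipe

# Hypothetical sample data, not from the original message.
is_even = lambda n: n % 2 == 0
is_big = lambda n: n > 5
result = list(filter_all(range(10), [is_even, is_big]))
print(result)  # [6, 8]
```

With an empty predicate list the loop body never runs, and the source comes back unfiltered.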

If "p=p" is removed (so that the implementation of the generator expression acts like a closure wrt p), the effect is to ignore all but the last predicate. Instead predicates[-1] is applied to source, and then applied redundantly to the survivors len(predicates)-1 times each. It's not obvious then that the result is wrong, and for some inputs it may even be correct.
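
A minimal Python 3 sketch of that failure mode, assuming the loop is wrapped in a hypothetical function "filter_last_only" so that p becomes a closure variable. The sample predicates are also made up.

```python
def filter_last_only(source, predicates):
    pipe = source
    for p in predicates:
        # "p=p" removed: p is looked up only when the generator runs, and by
        # then the loop has finished and p refers to predicates[-1].
        def g(pipe=pipe):
            for e in pipe:
                if p(e):
                    yield e
        pipe = g()
    return pipe

is_even = lambda n: n % 2 == 0
is_big = lambda n: n > 5
non_negative = lambda n: n >= 0

# The even-number filter is silently ignored; only is_big is applied.
wrong = list(filter_last_only(range(10), [is_even, is_big]))
print(wrong)  # [6, 7, 8, 9]

# Here the answer happens to be right, because the ignored predicate
# accepts every element anyway.
lucky = list(filter_last_only(range(10), [non_negative, is_big]))
print(lucky)  # [6, 7, 8, 9]
```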

If "pipe=pipe" is removed instead, it should produce a "generator already executing" exception, since the "pipe" in the final for-loop is bound to the same object as the "pipe" inside g then (all of the g's, but only the last g matters).
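
This case can be reproduced in Python 3 as well. In the sketch below, the function name "build_self_referential" is hypothetical, and pipe is a closure variable rather than a module-level name, which gives the same effect: the last generator ends up iterating over itself.

```python
def build_self_referential(source, predicates):
    pipe = source
    for p in predicates:
        # "pipe=pipe" removed: pipe is looked up when the generator runs,
        # by which time it refers to the last generator created here.
        def g(p=p):
            for e in pipe:
                if p(e):
                    yield e
        pipe = g()
    return pipe

is_even = lambda n: n % 2 == 0
pipe = build_self_referential(range(10), [is_even])

try:
    list(pipe)
    message = None
except ValueError as exc:
    # The generator tried to advance itself while it was already running.
    message = str(exc)
print(message)
```

In Python 3 the exception is a ValueError, so the failure is at least loud, unlike the silent wrong answer in the "p=p" case.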


