Split a sequence or generator using a predicate (Python recipe, ActiveState Code)

Split a sequence or generator into two iterators, each iterating over the elements that either pass or fail a predicate function.

```python
from collections import deque


def splitby(pred, seq):
    trues = deque()
    falses = deque()
    iseq = iter(seq)

    def pull(source, pred, thisval, thisbuf, otherbuf):
        while True:
            # first hand out anything the other iterator buffered for us
            while thisbuf:
                yield thisbuf.popleft()
            try:
                newitem = next(source)
            except StopIteration:
                # source is exhausted; an uncaught StopIteration inside a
                # generator becomes a RuntimeError on Python 3.7+ (PEP 479)
                return
            # uncomment next line to show that source is processed only once
            # print("pulled", newitem)
            # bool() so that truthy/falsy results land in exactly one group
            if bool(pred(newitem)) == thisval:
                yield newitem
            else:
                otherbuf.append(newitem)

    true_iter = pull(iseq, pred, True, trues, falses)
    false_iter = pull(iseq, pred, False, falses, trues)
    return true_iter, false_iter
```

Sometimes a sequence of items must be split into two groups depending on the result of a predicate function. Using groupby to do this requires that the sequence first be sorted by the predicate result and then grouped. Using filter/filterfalse requires two passes over the input sequence, which fails if the input is actually a generator, because the first pass exhausts it. Paul Moore suggests using tee to create a split iterator over the sequence, and then passing the tee'd iterators to filter and filterfalse. That is simpler code, but it evaluates the predicate twice for each element in the sequence. This recipe evaluates the predicate once per element and pushes the value to the corresponding match or no-match output iterator.
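For comparison, the tee-based alternative mentioned above might look roughly like the sketch below. The names split_with_tee, is_even and the calls counter are made up for this illustration and are not part of the recipe; the counter is only there to make the doubled predicate evaluation visible.

```python
from itertools import filterfalse, tee


def split_with_tee(pred, seq):
    """Hypothetical helper: split seq into (matching, non-matching) iterators."""
    first, second = tee(seq)
    return filter(pred, first), filterfalse(pred, second)


calls = 0  # illustration only: counts predicate evaluations


def is_even(n):
    """Predicate that records how often it is called."""
    global calls
    calls += 1
    return n % 2 == 0


# A generator as input, so a plain two-pass filter/filterfalse would not work.
evens, odds = split_with_tee(is_even, (n for n in range(6)))
evens_list = list(evens)
odds_list = list(odds)

print(evens_list)  # [0, 2, 4]
print(odds_list)   # [1, 3, 5]
print(calls)       # 12: the predicate ran twice for each of the 6 elements
```

If the predicate is cheap, the doubled evaluation probably does not matter and the shorter code may be preferable; the single-evaluation recipe is mainly of interest when the predicate is expensive or has side effects.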

The splitby recipe traverses the input sequence only once, so it works for a generator, and it pulls from the generator only as values are pulled from the two output iterators. Items that belong to the other iterator are held in a deque until that iterator asks for them.
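The lazy, single-pass behaviour can be observed with a small experiment. The sketch below restates splitby in a compact form so that it runs on its own (same idea as the recipe, with a dict of buffers instead of two named deques); traced_numbers and the pulled list are hypothetical names introduced only to record what is taken from the source.

```python
from collections import deque
from itertools import count, islice


def splitby(pred, seq):
    """Compact restatement of the recipe so this snippet is self-contained."""
    buffers = {True: deque(), False: deque()}
    source = iter(seq)

    def pull(wanted):
        while True:
            # drain items that the other iterator set aside for us
            while buffers[wanted]:
                yield buffers[wanted].popleft()
            try:
                item = next(source)
            except StopIteration:
                return
            if bool(pred(item)) == wanted:
                yield item
            else:
                buffers[not wanted].append(item)

    return pull(True), pull(False)


pulled = []  # illustration only: every value taken from the source


def traced_numbers():
    """Endless generator that records each value it hands out."""
    for n in count(1):
        pulled.append(n)
        yield n


multiples, others = splitby(lambda n: n % 3 == 0, traced_numbers())

first_multiples = list(islice(multiples, 2))
print(first_multiples)  # [3, 6]
print(pulled)           # [1, 2, 3, 4, 5, 6]

first_others = list(islice(others, 5))
print(first_others)     # [1, 2, 4, 5, 7]
print(pulled)           # [1, 2, 3, 4, 5, 6, 7]
```

Asking for two multiples of 3 pulls only the values 1 through 6 from the endless source. The first four non-multiples then come from the buffer, and only the fifth causes one more value to be pulled. The flip side is memory: if one output iterator is consumed far ahead of the other, the buffer for the lagging one grows accordingly.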