[Python-ideas] [Python-Dev] minmax() function returning (minimum, maximum) tuple of a sequence

Steven D'Aprano steve at pearwood.info
Tue Oct 26 12:10:59 CEST 2010


Guido van Rossum wrote:

> This should not require threads.

> Here's a bare-bones sketch using generators:
>
> def reduce_collector(func):
>     outcome = None
>     while True:
>         try:
>             val = yield
>         except GeneratorExit:
>             break
>         if outcome is None:
>             outcome = val
>         else:
>             outcome = func(outcome, val)
>     raise StopIteration(outcome)
>
> def parallel_reduce(iterable, funcs):
>     collectors = [reduce_collector(func) for func in funcs]
>     values = [None for _ in collectors]
>     for i, coll in enumerate(collectors):
>         try:
>             next(coll)
>         except StopIteration as err:
>             values[i] = err.args[0]
>             collectors[i] = None
>     for val in iterable:
>         for i, coll in enumerate(collectors):
>             if coll is not None:
>                 try:
>                     coll.send(val)
>                 except StopIteration as err:
>                     values[i] = err.args[0]
>                     collectors[i] = None
>     for i, coll in enumerate(collectors):
>         if coll is not None:
>             try:
>                 coll.throw(GeneratorExit)
>             except StopIteration as err:
>                 values[i] = err.args[0]
>     return values
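For concreteness, driving that sketch for the minmax() case in the thread's subject line would look like this. Note that it relies on StopIteration escaping a generator, which only works on pre-PEP 479 Pythons such as the 3.1 used in this thread (on 3.7+ the raise inside the generator becomes a RuntimeError):

    >>> parallel_reduce([3, 1, 4, 1, 5, 9, 2, 6], [min, max])
    [1, 9]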

Perhaps I'm missing something, but to my mind, that's an awfully complicated solution for such a simple problem. Here's my attempt:

def multi_reduce(iterable, funcs):
    it = iter(iterable)
    # Seed every collector with the first item, then fold in the rest.
    collectors = [next(it)] * len(funcs)
    for x in it:
        for i, f in enumerate(funcs):
            collectors[i] = f(collectors[i], x)
    return collectors
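For example, folding min and max over a small list in a single pass:

    >>> multi_reduce([3, 1, 4, 1, 5, 9, 2, 6], [min, max])
    [1, 9]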

I've called it multi_reduce rather than parallel_reduce because it doesn't execute the functions in parallel. By my testing on Python 3.1.1, multi_reduce is consistently ~30% faster than the generator-based solution for lists of 1,000 to 10,000,000 items.
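For anyone who wants to repeat the timing, a minimal harness along these lines will do. This is just a sketch: it assumes both functions above are defined in the same script, and the exact numbers will of course vary with the machine and Python version:

    import timeit

    setup = ("from __main__ import multi_reduce, parallel_reduce; "
             "data = list(range(100000))")
    for name in ("multi_reduce", "parallel_reduce"):
        t = timeit.timeit("%s(data, [min, max])" % name,
                          setup=setup, number=10)
        print(name, t)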

So what am I missing? What does your parallel_reduce give us that multi_reduce doesn't?

-- Steven
