[Python-ideas] [Python-Dev] minmax() function returning (minimum, maximum) tuple of a sequence (original) (raw)

Tal Einat taleinat at gmail.com
Wed Oct 13 23:54:31 CEST 2010


On Mon, Oct 11, 2010 at 10:18 PM, Tal Einat wrote:

Masklinn wrote:

On 2010-10-11, at 02:55 , Zac Burns wrote:

Unfortunately this solution seems incompatable with the implementations with for loops in min and max (EG: How do you switch functions at the right time?) So it might take some tweaking. As far as I know, there is no way to force lockstep iteration of arbitrary functions in Python. Though an argument could be made for adding coroutine capabilities to builtins and library functions taking iterables, I don't think that's on the books. As a result, this function would devolve into something along the lines of  def apply(iterable, *funcs):  return map(lambda c: c0, zip(funcs, tee(iterable, len(funcs)))) which would run out of memory on very long or nigh-infinite iterables due to tee memoizing all the content of the iterator. We recently needed exactly this -- to do several running calculations in parallel on an iterable. We avoided using co-routines and just created a RunningCalc class with a simple interface, and implemented various running calculations as sub-classes, e.g. min, max, average, variance, n-largest. This isn't very fast, but since generating the iterated values is computationally heavy, this is fast enough for our uses. Having a standard method to do this in Python, with implementations for common calculations in the stdlib, would have been nice. I wouldn't mind trying to work up a PEP for this, if there is support for the idea.

After some thought, I've found a way to make running several "running calculations" in parallel fast. Speed should be comparable to having used the non-running variants.

The method is to give each running calculation "blocks" of values instead of just one at a time. The apply_in_parallel(iterable, block_size=1000, *running_calcs) function would get blocks of values from the iterable and pass them to each running calculation separately. So RunningMax would look something like this:

class RunningMax(RunningCalc): def init(self): self.max_value = None

def feed(self, value):
    if self.max_value is None or value > self.max_value:
        self.max_value = value

def feedMultiple(self, values):
    self.feed(max(values))

feedMultiple() would have a naive default implementation in the base class.

Now this is non-trivial and can certainly be useful. Thoughts? Comments?



More information about the Python-ideas mailing list