[Python-ideas] [Python-Dev] minmax() function returning (minimum, maximum) tuple of a sequence (original) (raw)
Tal Einat taleinat at gmail.com
Wed Oct 13 23:54:31 CEST 2010
- Previous message: [Python-ideas] [Python-Dev] minmax() function returning (minimum, maximum) tuple of a sequence
- Next message: [Python-ideas] [Python-Dev] minmax() function returning (minimum, maximum) tuple of a sequence
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
On Mon, Oct 11, 2010 at 10:18 PM, Tal Einat wrote:
Masklinn wrote:
On 2010-10-11, at 02:55 , Zac Burns wrote:
Unfortunately this solution seems incompatable with the implementations with for loops in min and max (EG: How do you switch functions at the right time?) So it might take some tweaking. As far as I know, there is no way to force lockstep iteration of arbitrary functions in Python. Though an argument could be made for adding coroutine capabilities to builtins and library functions taking iterables, I don't think that's on the books. As a result, this function would devolve into something along the lines of def apply(iterable, *funcs): return map(lambda c: c0, zip(funcs, tee(iterable, len(funcs)))) which would run out of memory on very long or nigh-infinite iterables due to tee memoizing all the content of the iterator. We recently needed exactly this -- to do several running calculations in parallel on an iterable. We avoided using co-routines and just created a RunningCalc class with a simple interface, and implemented various running calculations as sub-classes, e.g. min, max, average, variance, n-largest. This isn't very fast, but since generating the iterated values is computationally heavy, this is fast enough for our uses. Having a standard method to do this in Python, with implementations for common calculations in the stdlib, would have been nice. I wouldn't mind trying to work up a PEP for this, if there is support for the idea.
After some thought, I've found a way to make running several "running calculations" in parallel fast. Speed should be comparable to having used the non-running variants.
The method is to give each running calculation "blocks" of values instead of just one at a time. The apply_in_parallel(iterable, block_size=1000, *running_calcs) function would get blocks of values from the iterable and pass them to each running calculation separately. So RunningMax would look something like this:
class RunningMax(RunningCalc): def init(self): self.max_value = None
def feed(self, value):
if self.max_value is None or value > self.max_value:
self.max_value = value
def feedMultiple(self, values):
self.feed(max(values))
feedMultiple() would have a naive default implementation in the base class.
Now this is non-trivial and can certainly be useful. Thoughts? Comments?
- Tal Einat
- Previous message: [Python-ideas] [Python-Dev] minmax() function returning (minimum, maximum) tuple of a sequence
- Next message: [Python-ideas] [Python-Dev] minmax() function returning (minimum, maximum) tuple of a sequence
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]