[Python-Dev] Accumulation module

Raymond Hettinger raymond.hettinger at verizon.net
Tue Jan 13 14:51:56 EST 2004

I'm working on a module with some accumulation/reduction/statistical formulas:

    average(iterable)
    stddev(iterable, sample=False)
    product(iterable)
    nlargest(iterable, n=1)
    nsmallest(iterable, n=1)
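
For illustration only, here is a minimal sketch of how the first three functions might behave, written in present-day Python. Only the names and signatures come from the list above; the bodies, the two-pass approach in stddev, and the choice to raise ValueError on too little data are assumptions, not the module's actual code.

```python
import math
import operator
from functools import reduce


def average(iterable):
    """Arithmetic mean, computed in one pass so any iterable works."""
    total = 0.0
    count = 0
    for x in iterable:
        total += x
        count += 1
    if count == 0:
        # Assumption: an empty input is treated as an error.
        raise ValueError("average() requires at least one value")
    return total / count


def product(iterable):
    """Multiply all items together; an empty iterable gives 1."""
    return reduce(operator.mul, iterable, 1)


def stddev(iterable, sample=False):
    """Two-pass standard deviation.

    sample=False divides by n (population); sample=True divides by n - 1.
    """
    data = list(iterable)          # second pass needs the data again
    n = len(data)
    divisor = n - 1 if sample else n
    if divisor <= 0:
        # Assumption: too few values is treated as an error.
        raise ValueError("stddev() requires more data")
    mean = sum(data) / n
    sum_sq = sum((x - mean) ** 2 for x in data)
    return math.sqrt(sum_sq / divisor)
```

With this sketch, stddev(data) gives the population figure and stddev(data, sample=True) gives the sample figure, which is the distinction the keyword is meant to express.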

The questions that have arisen so far are:

* What to call the module?
* What else should be in it?
* Is "sample" a good keyword to distinguish from population stddev?
* There seems to be a speed/space choice on how to implement nlargest/nsmallest. The faster way lists out the entire iterable, heapifies it, and pops off the top n elements. The slower way is less memory intensive: build only an n-length heap and then just do a heapreplace when necessary. Note that heapq is used for both (I use operator.neg to swap between largest and smallest).
* Is there a way to compute the standard deviation without multiple passes over the data (one to compute the mean and a second to sum the squares of the differences from the mean)?
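
To make the last two questions concrete, the sketch below shows both nlargest strategies described above, plus one known single-pass approach to the standard deviation: the running-mean update usually attributed to Welford. The function names (nlargest_full, nlargest_bounded, stddev_onepass) are hypothetical and the bodies are written in present-day Python as an illustration, not as the module's implementation. The negation trick assumes numeric items.

```python
import heapq
import itertools
import math


def nlargest_full(iterable, n=1):
    """List out everything, heapify, and pop the top n items."""
    # heapq is a min-heap, so negate values to pop the largest first.
    heap = [-x for x in iterable]
    heapq.heapify(heap)
    return [-heapq.heappop(heap) for _ in range(min(n, len(heap)))]


def nlargest_bounded(iterable, n=1):
    """Keep only an n-item heap of the largest values seen so far."""
    if n <= 0:
        return []
    it = iter(iterable)
    heap = list(itertools.islice(it, n))   # first n items seed the heap
    heapq.heapify(heap)
    for x in it:
        # heap[0] is the smallest of the current top n.
        if x > heap[0]:
            heapq.heapreplace(heap, x)
    return sorted(heap, reverse=True)


def stddev_onepass(iterable, sample=False):
    """Single-pass standard deviation using a running mean (Welford)."""
    count = 0
    mean = 0.0
    m2 = 0.0   # running sum of squared differences from the current mean
    for x in iterable:
        count += 1
        delta = x - mean
        mean += delta / count
        m2 += delta * (x - mean)
    divisor = count - 1 if sample else count
    if divisor <= 0:
        # Assumption: too few values is treated as an error.
        raise ValueError("stddev_onepass() requires more data")
    return math.sqrt(m2 / divisor)
```

The bounded version never holds more than n items at once, while the full version holds the whole input. The single-pass stddev consumes the iterable once, so it also works on generators that cannot be replayed.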

Raymond Hettinger