[Python-Dev] Fwd: [Python-ideas] stats module Was: minmax() function ...

Raymond Hettinger raymond.hettinger at gmail.com
Fri Oct 15 22:00:16 CEST 2010


Hello guys. If you don't mind, I would like to hijack your thread :-)

ISTM that the minmax() idea is really just an optimization request. A single-pass minmax() is easily coded in simple, pure Python, so really the discussion is about how to remove the loop overhead (there isn't much you can do about the cost of the two compares per item, which is where most of the time would be spent anyway).
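
For concreteness, a minimal sketch of such a single-pass minmax() in pure Python. The function name, return order, and the ValueError on empty input are assumptions chosen for illustration, not a proposed API:

```python
def minmax(iterable):
    """Return (smallest, largest) using a single pass over iterable.

    Hypothetical helper: name and error behaviour are illustrative only.
    """
    it = iter(iterable)
    try:
        lo = hi = next(it)          # seed both results with the first item
    except StopIteration:
        raise ValueError("minmax() arg is an empty iterable") from None
    for x in it:
        if x < lo:                  # first compare
            lo = x
        elif x > hi:                # second compare, only if the first failed
            hi = x
    return lo, hi


print(minmax([3, 1, 4, 1, 5, 9, 2, 6]))   # (1, 9)
```

Because it consumes the input only once, this also works on iterators and generators, where calling min() and then max() would not.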

My suggestion is to aim higher. There is no reason a single pass couldn't also return min/max/len/sum and perhaps even other summary statistics like sum(x**2) so that you can compute standard deviation and variance.
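
A rough sketch of what such a single pass might look like. The names Summary, summarize, and variance are hypothetical, and the variance shown is the population variance derived from sum(x**2); that textbook formula can lose precision when the mean is large relative to the spread, so treat this as an illustration of the idea rather than a recommended algorithm:

```python
from collections import namedtuple
import math

# Hypothetical result type: the field names are illustrative only.
Summary = namedtuple("Summary", "n min max total sumsq")


def summarize(iterable):
    """Collect len, min, max, sum and sum of squares in one pass."""
    n = 0
    total = 0.0
    sumsq = 0.0
    lo = hi = None
    for x in iterable:
        if n == 0:
            lo = hi = x             # first item seeds min and max
        elif x < lo:
            lo = x
        elif x > hi:
            hi = x
        n += 1
        total += x
        sumsq += x * x
    if n == 0:
        raise ValueError("summarize() arg is an empty iterable")
    return Summary(n, lo, hi, total, sumsq)


def variance(s):
    """Population variance from the accumulated sums: E[x**2] - E[x]**2."""
    mean = s.total / s.n
    return s.sumsq / s.n - mean * mean


s = summarize([2, 4, 4, 4, 5, 5, 7, 9])
print(s.n, s.min, s.max, s.total)          # 8 2 9 40.0
print(math.sqrt(variance(s)))              # 2.0
```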

A few years ago, Guido and other python devvers supported a proposal I made to create a stats module, but I didn't have time to develop it. The basic idea was that python's batteries should include most of the functionality available on advanced student calculators. Another idea behind it was that we could invisibly do-the-right-thing under the hood to help users avoid numerical problems (e.g. math.fsum(s)/len(s) is a more accurate way to compute an average because it doesn't lose precision when building-up the intermediate sums).
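
A small illustration of the fsum point. The mean() helper is a hypothetical name for this sketch; math.fsum itself is the standard-library function:

```python
import math


def mean(data):
    """Average computed with math.fsum to avoid losing precision
    in the intermediate sums. Hypothetical helper for illustration."""
    data = list(data)               # need len(), so materialize iterators
    if not data:
        raise ValueError("mean() arg is an empty iterable")
    return math.fsum(data) / len(data)


data = [1e100, 1.0, -1e100]
print(sum(data) / len(data))        # 0.0 (the 1.0 is lost in the running sum)
print(mean(data))                   # 0.3333333333333333
```

The plain sum() loses the 1.0 because 1e100 + 1.0 rounds back to 1e100 before the large terms cancel, while fsum tracks the lost low-order bits and returns the correctly rounded total.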

I think the creativity and energy of this group are much better directed at building a quality stats module (perhaps with some R-like capabilities). That would likely be a better use of energy than bike-shedding about ways to speed up a trivial piece of code that is ultimately constrained by the cost of the compares per item.

my-two-cents,

Raymond


