[Python-ideas] Fwd: stats module Was: minmax() function ...
Masklinn masklinn at masklinn.net
Fri Oct 15 22:56:46 CEST 2010
- Previous message: [Python-ideas] Fwd: stats module Was: minmax() function ...
- Next message: [Python-ideas] stats module Was: minmax() function ...
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
On 2010-10-15, at 22:01 , Raymond Hettinger wrote:
Drat. This should have gone to python-ideas. Re-sending.
Begin forwarded message:
From: Raymond Hettinger <raymond.hettinger at gmail.com>
Date: October 15, 2010 1:00:16 PM PDT
To: Python-Dev Dev <python-dev at python.org>
Subject: Fwd: [Python-ideas] stats module Was: minmax() function ...
Hello guys. If you don't mind, I would like to hijack your thread :-)

ISTM that the minmax() idea is really just an optimization request. A single-pass minmax() is easily coded in simple, pure Python, so really the discussion is about how to remove the loop overhead (there isn't much you can do about the cost of the two compares, which is where most of the time would be spent anyway).

My suggestion is to aim higher. There is no reason a single pass couldn't also return min/max/len/sum and perhaps even other summary statistics like sum(x**2) so that you can compute standard deviation and variance.

A few years ago, Guido and other Python developers supported a proposal I made to create a stats module, but I didn't have time to develop it. The basic idea was that Python's batteries should include most of the functionality available on advanced student calculators. Another idea behind it was that we could invisibly do-the-right-thing under the hood to help users avoid numerical problems (e.g. math.fsum(s)/len(s) is a more accurate way to compute an average because it doesn't lose precision when building up the intermediate sums).

I think the creativity and energy of this group is much better directed at building a quality stats module (perhaps with some R-like capabilities). That would likely be a better use of energy than bike-shedding about ways to speed up a trivial piece of code that is ultimately constrained by the cost of the compares per item.

my-two-cents,
Raymond
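For illustration only, here is a minimal sketch of the kind of single-pass summary described above. The name summarize() and its return layout are assumptions made for this example, not part of any proposed stats module. It accumulates plain running sums, so it does not get the extra precision of math.fsum(), and the sum-of-squares variance formula can lose accuracy when the mean is large relative to the spread.

```python
def summarize(iterable):
    """Return (minimum, maximum, count, total, total_sq) in one pass.

    Hypothetical helper for illustration; not an existing stdlib function.
    """
    it = iter(iterable)
    try:
        first = next(it)
    except StopIteration:
        raise ValueError("summarize() arg is an empty iterable") from None
    lo = hi = first
    n = 1
    total = first
    total_sq = first * first
    for x in it:
        # At most two compares per item, as in a hand-written minmax().
        if x < lo:
            lo = x
        elif x > hi:
            hi = x
        n += 1
        total += x
        total_sq += x * x
    return lo, hi, n, total, total_sq


data = [2, 4, 4, 4, 5, 5, 7, 9]
lo, hi, n, total, total_sq = summarize(data)
mean = total / n
variance = total_sq / n - mean ** 2  # population variance, naive formula
```

When the data is already a list, math.fsum(data) / len(data) remains the more accurate way to get the mean, at the cost of a second pass.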
I think I'd still go with composable coroutines, the kind of thing dabeaz shows and promotes in his training sessions. Maybe with a higher-level interface to make them easier to use, but they seem a perfect fit for this sort of problem, where you build arbitrary data pipelines including forks and joins.
As others mentioned, generator-based coroutines in Python have to be primed (by calling next() once on them) before they can receive values, which is kind of a pain, but the decorator to "fix" that is easy enough to write.
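A rough sketch of what that could look like, assuming a priming decorator named coroutine and two made-up stages, track() and broadcast(); none of these names come from an existing library. The decorator advances each new generator to its first yield so that send() works immediately, and broadcast() is the "fork" that feeds every item to several consumers in a single pass.

```python
import functools
import operator


def coroutine(func):
    """Decorator that primes a generator-based coroutine on creation."""
    @functools.wraps(func)
    def primed(*args, **kwargs):
        gen = func(*args, **kwargs)
        next(gen)  # advance to the first yield so send() can be used
        return gen
    return primed


@coroutine
def track(results, key, better):
    """Keep the best value seen so far in results[key]."""
    best = yield
    results[key] = best
    while True:
        x = yield
        if better(x, best):
            best = x
            results[key] = best


@coroutine
def broadcast(targets):
    """Fork: forward every received item to each target coroutine."""
    while True:
        item = yield
        for target in targets:
            target.send(item)


def minmax(iterable):
    """Single pass over iterable; raises KeyError if it is empty."""
    results = {}
    pipe = broadcast([
        track(results, "min", operator.lt),
        track(results, "max", operator.gt),
    ])
    for item in iterable:
        pipe.send(item)
    return results["min"], results["max"]
```

Adding another statistic would then only mean appending another consumer to the broadcast() list, although the per-item send() calls add overhead of their own.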