[Python-Dev] sum(...) limitation

Chris Barker chris.barker at noaa.gov
Tue Aug 12 21:11:35 CEST 2014


On Mon, Aug 11, 2014 at 11:07 PM, Stephen J. Turnbull <stephen at xemacs.org> wrote:

I'm referring to removing the unnecessary information that there's a better way to do it, and simply raising an error (as in Python 3.2, say) which is all a RealProgrammer[tm] should ever need!

I can't imagine anyone is suggesting that -- disallow it, but don't tell anyone why?

The only thing that is remotely on the table here is:

1) remove the special case for strings -- buyer beware -- but consistent and less "ugly"
2) add a special case for strings that is fast and efficient -- may be as simple as calling "".join() under the hood -- no more code than the exception check.

And I doubt anyone is really pushing for anything but (2).
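A minimal sketch of what option (2) could look like, written as a pure-Python wrapper rather than a change to the built-in. The name sum_with_str_fast_path is hypothetical, and the sketch assumes that a str start value means every item is a str too:

```python
def sum_with_str_fast_path(iterable, start=0):
    """Behave like sum(), but join strings instead of raising TypeError."""
    if isinstance(start, str):
        # Assumption: all items are str; "".join() raises TypeError otherwise.
        return start + "".join(iterable)
    # Everything else goes through the ordinary built-in.
    return sum(iterable, start)


print(sum_with_str_fast_path(["a", "b", "c"], ""))  # abc
print(sum_with_str_fast_path([1, 2, 3]))            # 6
```

The string branch is about the same size as the check that currently raises TypeError for sum(["a", "b"], "").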

Stephen Turnbull wrote:

IMO we'd also want a homogeneous_iterable ABC

Actually, I've thought for years that that would open the door to a lot of optimizations -- but that's a much broader question than sum(). I even brought it up probably over ten years ago -- but no one was the least bit interested -- nor are they now -- I know this was a rhetorical suggestion to make the point about what not to do....

Because obviously we'd want the attractive nuisance of "if you have __add__, there's a default definition of sum"

now I'm confused -- isn't that exactly what we have now?

It's possible that Python could provide some kind of feature that would allow an optimized sum function for every type that has __add__, but I think this will take a lot of thinking.

does it need to be every type? As it is the common ones work fine already except for strings -- so if we add an optimized string sum() then we're done.

Somebody will do it (I don't think anybody is +1 on restricting sum() to a subset of types with __add__).

uhm, that's exactly what we have now -- you can use sum() with anything that has an __add__, except strings. And by that logic, if we thought there were other inefficient use cases, we'd restrict those too.

But users can always define their own classes that have a sum and are really inefficient -- so unless sum() becomes just for a certain subset of built-in types -- does anyone want that? Then we are back to the current situation:

sum() can be used for any type that has an __add__ defined.

But naive users are likely to try it with strings, and that's bad, so we want to prevent that, and have a special case check for strings.

What I fail to see is why it's better to raise an exception and point users to a better way than to simply provide an optimization so that it's a moot issue.

The only justification offered here is that it will teach people that summing strings (and some other objects?) is O(N^2) and a bad idea. But:

a) Python's primary purpose is practical, not pedagogical (not that it isn't great for that)

b) I doubt any naive users learn anything other than "I can't use sum() for strings, I should use "".join()". Will they make the leap to "I shouldn't use string concatenation in a loop, either"? Oh, wait, you can use string concatenation in a loop -- that's been optimized. So will they learn: "some types of objects have poor performance with repeated concatenation and shouldn't be used with sum(). So if I write such a class, and want to sum them up, I'll need to write an optimized version of that code"?

I submit that no naive user is going to get any closer to a proper understanding of algorithmic Order behavior from this small hint. Which leaves no reason to prefer an Exception to an optimization.

One other point: perhaps this will lead a naive user into thinking -- "sum() raises an exception if I try to use it inefficiently, so it must be OK to use for anything that doesn't raise an exception" -- that would be a bad lesson to mis-learn....
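To make the point about user-defined classes concrete, here is a small sketch of a type that sum() accepts without complaint even though summing it is quadratic. CopyingBag and copies_for are hypothetical names made up for this illustration; the class counts how many elements its __add__ copies:

```python
class CopyingBag:
    """Hypothetical container whose __add__ copies both operands."""

    copies = 0  # class-wide count of elements copied by __add__

    def __init__(self, items=()):
        self.items = tuple(items)

    def __add__(self, other):
        combined = self.items + other.items  # builds a brand-new tuple
        CopyingBag.copies += len(combined)
        return CopyingBag(combined)


def copies_for(n):
    """Sum n single-element bags and report how many elements were copied."""
    CopyingBag.copies = 0
    sum((CopyingBag([i]) for i in range(n)), CopyingBag())
    return CopyingBag.copies


print(copies_for(10))   # 55
print(copies_for(100))  # 5050
```

Summing n bags copies n(n+1)/2 elements in total, and no exception is raised along the way.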

-Chris

PS: Armin Rigo wrote:

It also improves a lot the precision of sum(list_of_floats) (though not reaching the same precision levels of math.fsum()).

While we are at it, having the default sum() for floats be fsum() would be nice -- I'd rather the default was better accuracy, lower performance. Folks that really care about performance could call a (hypothetical) math.fastsum(), or really, use numpy...
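A quick illustration of the accuracy gap in question. The naive_sum helper is a hypothetical stand-in for plain left-to-right float addition; what the built-in sum() does internally for floats may differ between Python versions, so it is deliberately not used here:

```python
import math


def naive_sum(values):
    """Add floats left to right with no error compensation."""
    total = 0.0
    for value in values:
        total += value
    return total


values = [0.1] * 10
print(naive_sum(values))  # 0.9999999999999999
print(math.fsum(values))  # 1.0
```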

This does turn sum() into a function that does type-based dispatch, but isn't Python full of those already? Do something special for the types you know about, call the generic dunder method for the rest.
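A rough sketch of that kind of dispatch, assuming the only special case is an all-float input with a numeric start value. The name dispatching_sum is hypothetical, and materializing the input as a list is a simplification that a real implementation would presumably avoid:

```python
import math


def dispatching_sum(iterable, start=0):
    """Use math.fsum() for floats, the generic __add__ path for the rest."""
    items = list(iterable)  # simplification: lets us inspect the item types
    all_floats = bool(items) and all(isinstance(x, float) for x in items)
    if all_floats and isinstance(start, (int, float)):
        # Known type: take the accurate path.
        return math.fsum([start, *items])
    # Unknown type: fall back to repeated addition via the built-in.
    return sum(items, start)


print(dispatching_sum([0.1] * 10))     # 1.0
print(dispatching_sum([1, 2, 3]))      # 6
print(dispatching_sum([[1], [2]], []))  # [1, 2]
```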

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov
