On Tue, Aug 12, 2014 at 11:21 PM, Stephen J. Turnbull <stephen@xemacs.org> wrote:
Redirecting to python-ideas, so trimming less than I might.
reasonable enough -- you are introducing some more significant ideas for changes.
I've said all I have to say about this -- I don't seem to see anything encouraging from core devs, so I guess that's it.
Thanks for the fun bike-shedding...
-Chris
Chris Barker writes:
> On Mon, Aug 11, 2014 at 11:07 PM, Stephen J. Turnbull <stephen@xemacs.org>
> wrote:
>
> > I'm referring to removing the unnecessary information that there's a
> > better way to do it, and simply raising an error (as in Python 3.2,
> > say) which is all a RealProgrammer[tm] should ever need!
> >
>
> I can't imagine anyone is suggesting that -- disallow it, but don't tell
> anyone why?
As I said, it's a regression. That's exactly the behavior in Python 3.2.
> The only thing that is remotely on the table here is:
>
> 1) remove the special case for strings -- buyer beware -- but consistent
> and less "ugly"
It's only consistent if you believe that Python has strict rules for
use of various operators. It doesn't, except as far as they are
constrained by precedence. For example, I have an application where I
add bytestrings bytewise modulo N <= 256, and concatenate them. In
fact I use function call syntax, but the obvious operator syntax is
'+' for the bytewise addition, and '*' for the concatenation.
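For illustration only, here is a minimal sketch of what such a bytewise addition might look like in function call syntax. The name `add_mod` and the equal-length requirement are assumptions made for this example, not details of the application described above.

```python
def add_mod(a: bytes, b: bytes, n: int = 256) -> bytes:
    """Add two equal-length bytestrings element by element, modulo n (n <= 256)."""
    if not 1 <= n <= 256:
        raise ValueError("n must be between 1 and 256")
    if len(a) != len(b):
        raise ValueError("bytestrings must have the same length")
    # Each result element is below n <= 256, so it always fits in one byte.
    return bytes((x + y) % n for x, y in zip(a, b))


summed = add_mod(b"\x01\xff", b"\x02\x02")   # bytewise addition, wraps at 256
joined = b"\x01\xff" + b"\x02\x02"           # the builtin '+' concatenates instead
```

The two results differ in length as well as content, which is the point: nothing about '+' itself forces one meaning over the other.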
It's not in the Zen, but I believe in the maxim "If it's worth doing,
it's worth doing well." So for me, 1) is out anyway.
> 2) add a special case for strings that is fast and efficient -- may be as
> simple as calling "".join() under the hood -- no more code than the
> exception check.
Sure, but what about all the other immutable containers with __add__
methods? What about mappings with key-wise __add__ methods whose
values might be immutable but have __add__ methods? Where do you stop
with the special-casing? I consider this far more complex and ugly
than the simple "sum() is for numbers" rule (and even that is way too
complex considering accuracy of summing floats).
> And I doubt anyone really is pushing for anything but (2)
I know that, but I think it's the wrong solution to the problem (which
is genuine IMO). The right solution is something generic, possibly a
__sum__ method. The question is whether that leads to too much work
to be worth it (eg, "homogeneous_iterable").
> > Because obviously we'd want the attractive nuisance of "if you
> > have __add__, there's a default definition of __sum__"
>
> now I'm confused -- isn't that exactly what we have now?
Yes, and my feeling (backed up by arguments that I admit may persuade
nobody but myself) is that what we have now kinda sucks[tm]. It
seemed like a good idea when I first saw it, but then, my apps don't
scale to where the pain starts in my own usage.
> > It's possible that Python could provide some kind of feature that
> > would allow an optimized sum function for every type that has
> > __add__, but I think this will take a lot of thinking.
>
> does it need to be every type? As it is the common ones work fine already
> except for strings -- so if we add an optimized string sum() then we're
> done.
I didn't say provide an optimized sum(), I said provide a feature
enabling people who want to optimize sum() to do so. So yes, it needs
to be every type (the optional __sum__ method is a proof of concept,
modulo it actually being implementable ;-).
> > *Somebody* will do it (I don't think anybody is +1 on restricting
> > sum() to a subset of types with __add__).
>
> uhm, that's exactly what we have now
Exactly. Who's arguing that the sum() we have now is a ticket to
Paradise? I'm just saying that there's probably somebody out there
negative enough on the current situation to come up with an answer
that I think is general enough (and I suspect that python-dev
consensus is that demanding, too).
> sum() can be used for any type that has an __add__ defined.
I'd like to see that be mutable types with __iadd__.
> What I fail to see is why it's better to raise an exception and
> point users to a better way, than to simply provide an optimization
> so that it's a moot issue.
Because inefficient sum() is an attractive nuisance, easy to overlook,
and likely to bite users other than the author.
> The only justification offered here is that will teach people that summing
> strings (and some other objects?)
Summing tuples works (with appropriate start=tuple()). Haven't
benchmarked, but I bet that's O(N^2).
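A small sketch of the tuple case, with made-up data in `pairs`. It only shows that both spellings give the same result and does not benchmark anything; the itertools alternative is included as one common single-pass way to flatten, not as something proposed in this thread.

```python
from itertools import chain

pairs = [(1, 2), (3,), (4, 5, 6)]

# sum() needs an explicit empty-tuple start; the default start of 0
# would attempt 0 + (1, 2) and fail.
flat_sum = sum(pairs, ())

# Each '+' above builds a brand-new tuple, so the work grows with the size
# of the partial result. chain.from_iterable walks the items once instead.
flat_chain = tuple(chain.from_iterable(pairs))
```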
> is order(N^2) and a bad idea. But:
>
> a) Python's primary purpose is practical, not pedagogical (not that it
> isn't great for that)
My argument is that in practical use sum() is a bad idea, period,
until you book up on the types and applications where it *does* work.
N.B. It doesn't even work properly for numbers (inaccurate for floats).
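The float remark can be illustrated with a short sketch. The left-to-right accumulation is written as an explicit loop here so the example does not depend on how any particular Python version implements the builtin sum() for floats; `values` is just sample data.

```python
import math

values = [0.1] * 10

# Plain left-to-right floating-point addition accumulates rounding error.
naive = 0.0
for v in values:
    naive += v

# math.fsum tracks the lost low-order bits and returns the correctly
# rounded total.
accurate = math.fsum(values)
```

With this input the two totals differ in the last place: only the fsum result equals 1.0.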
> b) I doubt any naive users learn anything other than "I can't use sum() for
> strings, I should use "".join()".
For people who think that special-casing strings is a good idea, I
think this is about as much benefit as you can expect. Why go
farther?<0.5 wink/>
> I submit that no naive user is going to get any closer to a proper
> understanding of algorithmic Order behavior from this small hint. Which
> leaves no reason to prefer an Exception to an optimization.
TOOWTDI. str.join is in pretty much every code base by now, and
tutorials and FAQs recommending its use and severely deprecating sum
for strings are legion.
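As a concrete reminder of what those tutorials are steering around, here is a small sketch (the list `words` is made up for the example). It records the exceptions rather than asserting their exact wording, which may vary between versions.

```python
words = ["spam", "eggs", "ham"]

# Without a start value, sum() begins from 0 and fails on 0 + "spam".
try:
    sum(words)
except TypeError as exc:
    no_start_error = exc

# With an empty-string start, sum() still refuses to add strings.
try:
    sum(words, "")
except TypeError as exc:
    with_start_error = exc

# The recommended spelling builds the result in one pass.
joined = "".join(words)
```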
> One other point: perhaps this will lead a naive user into thinking --
> "sum() raises an exception if I try to use it inefficiently, so it must be
> OK to use for anything that doesn't raise an exception" -- that would be a
> bad lesson to mis-learn....
That assumes they know about the start argument. I think most naive
users will just try to sum a bunch of tuples, and get the "can't add
0, tuple" Exception and write a loop. I suspect that many of the
users who get the "use str.join" warning along with the Exception are
unaware of the start argument, too. They expect sum(iter_of_str) to
magically add the strings. Ie, when in 3.2 they got the
uninformative "can't add 0, str" message, they did not immediately go
"d'oh" and insert ", start=''" in the call to sum, they wrote a loop.
> while we are at it, having the default sum() for floats be fsum()
> would be nice
How do you propose to implement that, given math.fsum is perfectly
happy to sum integers? You can't just check one or a few leading
elements for floatiness. I think you have to dispatch on type(start),
but then sum(iter_of_floats) DTWT. So I would suggest changing the
signature to sum(it, start=0.0). This would probably be acceptable to
most users with iterables of ints, but does imply some performance hit.
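A quick sketch of the type consequences being weighed here, using today's builtins only; `ints` is sample data, and nothing below implements the proposed signature change.

```python
import math

ints = [1, 2, 3]

as_int = sum(ints)           # default start of 0 keeps the result an int
as_float = sum(ints, 0.0)    # a float start turns the result into a float
via_fsum = math.fsum(ints)   # fsum accepts ints but always returns a float
```

So a sum() that defaulted to start=0.0, or that delegated to fsum, would hand float results to callers who passed only ints.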
> This does turn sum() into a function that does type-based dispatch,
> but isn't python full of those already? do something special for
> the types you know about, call the generic dunder method for the
> rest.
AFAIK Python is moving in the opposite direction: if there's a common
need for dispatching to type-specific implementations of a method,
define a standard (not "generic") dunder for the purpose, and have the
builtin (or operator, or whatever) look up (not "call") the
appropriate instance in the usual way, then call it. If there's a
useful generic implementation, define an ABC to inherit from that
provides that generic implementation.
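To make the lookup-then-call idea concrete, here is a purely hypothetical sketch. No `__sum__` protocol exists in Python; the hook name, the `sum_with_hook` function, the choice to look the hook up on type(start), and the `Text` class are all assumptions invented for this illustration.

```python
from functools import reduce
import operator


def sum_with_hook(iterable, start=0):
    """Hypothetical sum(): look up __sum__ on type(start), else fold with '+'."""
    # Look the hook up on the type, the way builtins find dunder methods.
    hook = getattr(type(start), "__sum__", None)
    if hook is not None:
        return hook(start, iterable)
    # Fallback: the plain left-to-right '+' fold.
    return reduce(operator.add, iterable, start)


class Text(str):
    """Toy string type that opts in to the hypothetical hook."""

    def __sum__(self, iterable):
        # One join instead of repeated concatenation.
        return Text(str(self) + "".join(iterable))


total = sum_with_hook([1, 2, 3])                    # int has no hook: uses '+'
text = sum_with_hook(["a", "b", "c"], Text(""))     # Text supplies its own hook
```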
Christopher Barker, Ph.D.
Oceanographer
Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception
Chris.Barker@noaa.gov