Issue 7153: add "start" arg to max and min functions (original) (raw)
Issue7153
Created on 2009-10-16 19:24 by phr, last changed 2022-04-11 14:56 by admin. This issue is now closed.
Messages (7) | ||
---|---|---|
msg94145 - (view) | Author: paul rubin (phr) | Date: 2009-10-16 19:24 |
Lots of times I want to find the largest element of a list or sequence, defaulting to 0 if the list or sequence is empty. max(seq) throws an exception if seq is empty, so I end up using reduce(max, seq, 0). That is a standard functional programming idiom but can be a bit confusing to imperative-style Python programmers. max with multiple args is already overloaded to mean the maximum of the args, so I think it would be a good fix to add a keyword arg to accept an optional initial value: max(seq, start=0). For symmetry, min should accept the same arg. The alternatives to using reduce aren't so attractive. If seq happens to be a list, there might be a temptation to conditionalize on len(seq)==0, but that is poor style since it will break if seq later changes to an arbitrary sequence. And trying to test it by calling .next() and saving the value and/or trapping StopIteration gets awfully messy. | ||
msg94147 - (view) | Author: R. David Murray (r.david.murray) * ![]() |
Date: 2009-10-16 20:53 |
In case you don't remember it, this thread from python-ideas is relevant: http://mail.python.org/pipermail/python-ideas/2009-April/004107.html I can't tell from rereading the thread whether the support was for the 'initial' version, or the 'default' version that was more-or-less rejected as being too confusing. If the support was for the initial version then that would be support for this proposal. | ||
msg94151 - (view) | Author: paul rubin (phr) | Date: 2009-10-16 21:44 |
David, I'm not on that mailing list so hadn't seen the earlier discussion. I sympathasize with Raymond's YAGNI argument because I'm comfortable with reduce(max,seq,0); but then I remember there was once a movement to remove the "reduce" function from builtins, which would have broken that idiom. I also understand that not everyone is comfortable with that style. I recently had to hand over some code to another programmer where I had used that idiom, and in the course of adding comments to the code in preparation for the handover, I found myself writing quite a few words about why I'd used "reduce" that way, so I figured that "explicit is better than implicit" suggests adding default or initial args to the max function, just like "reduce" already has (I figure that max on a sequence is a special case of reduce). My proposed python implementation: def mymax(*args, **kwargs): if len(args) > 1: return max(*args) if len(args) == 0: raise TypeError, "mymax needs at least one positional arg" if 'initial' in kwargs: return reduce(max,args[0],kwargs['initial']) return reduce(max,args[0]) | ||
msg94164 - (view) | Author: R. David Murray (r.david.murray) * ![]() |
Date: 2009-10-17 05:01 |
Sorry, that "in case you don't remember' was directed at Raymond, not you. Since Raymond took ownership of this issue I don't think he's dismissing it (at least not yet :) I think his YAGNI was for the 'default' version, which is not what you are proposing. Note that 'reduce' is _not_ a builtin in Python3, although it is still available and so the idiom is not broken, just slightly wordier. | ||
msg94193 - (view) | Author: Raymond Hettinger (rhettinger) * ![]() |
Date: 2009-10-17 23:43 |
Here’s a summary of my research so far (including discussion with other programmers, a Google code search, discussion on #python-dev on IRC, and comparing the proposal to other APIs with start-arguments such as sum(), reduce(), and enumerate()): 1. Showed several examples to other programmers and found that they did not immediately get what the start argument was trying to do. Was a start argument the same as: reduce(max, seq, 0) # zero when empty and never less than zero max(seq) if seq else 0 # zero when empty (only works for sequences) max(chain([0], seq) # zero when empty and never less than zero 2. There is an issue of API complexity. Even if a feature is useful and clear, it may not be a good idea when the API of a function is already complex. In the case of min()/max(), we already have special handling for one argument (treated as an iterator) versus no arguments (treated an error versus multiple arguments (treated as an input sequence of values). We also have a key= argument. Taken together, the min/max functions already have a lot of features. 3.Beyond the complexity of having too many features in a function that should be simple, there is also an issue of how those features would interact: min(iterable, key=f, start=x) # is the default value x or f(x)? min(start=x) # should this be allowed? min(*args, start=x) # if so, what is behavior when len(args)==0 or 1 or 2? 4. The argument about reduce(max, seq, 0) being confusing to non-functional programmers isn’t persuasive since perfectly clear (though multi-line) exception-catching or size-checking imperative forms can be written, or one can can simply factor-out any recurring expressions if they seem awkward or confusing: def max_or_zero(iterable): 'Return zero if the iterable is empty or max(0, max(iterable)) otherwise' return functools.reduce(max, iterable, 0) 5. In the case of sequences, it can be clearer to write: max(seq) if seq else 0 max(seq + [0]) # or use itertools.chain() 6. A code search showed that max() is mostly used in a two-argument form. When it does get used with iterables, it doesn't seem common to trap ValueErrors. | ||
msg94196 - (view) | Author: paul rubin (phr) | Date: 2009-10-18 02:01 |
1. Yes, the idea is to figure out the best solution and go with it (or decide to do nothing). That many possibilities exist just points to the need for experience and wisdom in identifying the best choice ("one and preferably only one"). One of the attractive points about Python is it comes with so many such decisions already wisely made. The design wisdom embodied in Python is almost like a useful software library in its own right. So the presence of multiple choices should be seen as an invitation to resolve the issue, not a flag to keep it unresolved. 2. I agree that the multi-arg and iterator API's should have been done as separate functions (or denoted through a keyword arg), but what we have isn't too bad, and it's what we have. 3. min(iterable, key=f, start=x) should obviously default to the same thing as min([x], key=f). min(*args, start=x) should only be allowed when len(args)==1, since the start arg is restricted to the case of minimum over an iterator. min(start=x) should not be allowed. 4. I'm in general unpersuaded by the argument that a stdlib function isn't worth having because the same thing can be done by bloating up the user code. Python code should be concise, which means using the stdlib in preference to writing more user-defined functions that the next maintainer has to figure out, and which may have their own bugs and unhandled edge cases. For stdlib design, it's mostly a matter of deciding whether a construction occurs often enough to write a re-usable function for. In this case it wouldn't have occurred to me to write any of those suggested versions, instead of writing reduce(max, iterator, 0) with an explanatory comment. But even the observation that "reduce" made me bloat up the comments in the code, rather than the code itself, was enough to get me to suggest adding the initial arg. 5. I don't think it's good to rely on bool(seq) or len(seq) to be available if there's a simple construction (like here) that works for arbitrary iterables. It's better to handle the general case. I hadn't even realized til just now that "sequence" and "iterable" didn't mean the same thing. 6. Yes, I know it's not common to trap ValueErrors when using max with iterables. I wrote code without such traps myself and then got bitten by unhandled exceptions when some of those iterables turned out to be empty (hence my "reduce" hack). It wouldn't surprise me if lots more such code out there is similarly buggy. I think it's good to make bug-avoiding mechanisms obvious and convenient. | ||
msg94220 - (view) | Author: Raymond Hettinger (rhettinger) * ![]() |
Date: 2009-10-18 19:26 |
Thanks for the thoughtful reply. To give the idea another chance, today, I showed a code example to another experienced programmer. "What does this function return? max([-2, 5], start=0) " The person had a hard time making any sense of it. Perhaps the start argument was an index, or starting position, or a ceiling, or a floor. I'm rejecting the feature request for several reasons: * the code sample doesn't have obvious meaning to experienced programmers * the API of the min/max functions is already too complex -- the language needs to be easy to learn and remember * the start keyword argument doesn't interact well with the other features (such as the key argument or positional arguments). Mark questioned whether the key function would apply to the start value (any answer is arbitrary because either could work). Also, there was a question of how it would work with *args (does it change the case with zero arguments as it does for the iterator case). When the choice of semantics are arbitrary, it is better not to guess and instead have the coder explicitly say what is supposed to happen. * it isn't really needed, there are several existing ways to create the same effect. * the meaning of "start" is ambiguous and arbitrary (is it a default value for an empty input or it is like adding an extra value to the input sequence). We could pick one of the two meanings but that doesn't help a coder or code reviewer remember which meaning was correct. To prevent bugs, it is better for the code to explicitly spell-out how the corner case is to be handled. * in mathematical notation, I see the corner cases being handled by piecewise functions (if seq is empty, value is this, else compute min/max) instead of the notation trying to override the simple meaning of min/max. * I haven't found precendents in any other language. Even if there were, we would still have the problem of too many features being loaded onto Python's min/max and the unclear semantics of how those features would interact. Thank you for the feature request. Sorry, this one didn't pan out. |
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-11 14:56:54 | admin | set | github: 51402 |
2009-10-18 19:26:09 | rhettinger | set | status: open -> closedresolution: rejectedmessages: + |
2009-10-18 02:01:54 | phr | set | messages: + |
2009-10-17 23:43:15 | rhettinger | set | messages: + |
2009-10-17 05:01:35 | r.david.murray | set | messages: + |
2009-10-17 04:46:03 | esam | set | nosy: + esam |
2009-10-16 21:44:11 | phr | set | messages: + |
2009-10-16 20:53:33 | r.david.murray | set | priority: normalnosy: + r.david.murraymessages: + |
2009-10-16 20:13:39 | rhettinger | set | assignee: rhettingernosy: + rhettingerversions: + Python 2.7, Python 3.2, - Python 2.5 |
2009-10-16 19:34:03 | phr | set | type: enhancement |
2009-10-16 19:24:02 | phr | create |