Issue 2831: Adding start to enumerate()

Created on 2008-05-12 03:49 by scott.dial, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name       Uploaded                      Description
enumerate.diff  scott.dial, 2008-05-12 03:54  patch to add start= to enumerate
Messages (14)
msg66705 - Author: Scott Dial (scott.dial) Date: 2008-05-12 03:49
Georg Brandl suggested that enumerate() should be able to start at an arbitrary number (instead of always starting at 0). I suggest such a parameter should be keyword-only. Attached is a patch that adds the feature along with test cases. Documentation still needs to be updated, but I wasn't sure how best to handle that anyway. I also wasn't sure how best to implement a keyword-only argument, so I'd be interested to know if there is a better way.
msg66709 - Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2008-05-12 06:19
If a start argument gets accepted, it should be positional, not a keyword-only argument. Requiring a keyword is a complete waste when there is just one extra argument with a straightforward interpretation. Besides, METH_O is a lot faster than the alternatives.
msg66710 - Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2008-05-12 06:23
Forget the part about METH_O. That was incorrect. Another idea is to order the positional args as ([start,] iterator). That corresponds with range([start,] stop) and it matches the output order (number, element):

    for i, element in enumerate(10, iterable):
        ^-----------------------^
           ^-------------------------^
msg66711 - Author: Scott Dial (scott.dial) Date: 2008-05-12 06:35
As it stands, enumerate() already takes a "sequence" keyword as an alternative to the first positional argument (although this seems to be completely undocumented). So, as you say, METH_O is a no go. I agree with you in that my original complaint with the positional argument was that enumerate(iterable, start) was "backwards." My other argument was that a large number of these iterator utility functions are foo(*iterable) and upon seeing enumerate(foo, bar), a reader might be inclined to assume it was equivalent to enumerate(chain(foo, bar)).
msg66712 - Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2008-05-12 07:00
FWIW, at one point, Guido rejected all variants of the idea. His first objection was that enumerate() is all about pairing values with sequence indices, so starting from anything other than zero is in conflict with the core concept. His second objection was that all variants can easily be misread as starting at the nth item in the sequence (much like islice() does now): enumerate(3, 'abcdefg') --> (3, 'd') (4, 'e') (5, 'f') (6, 'g'). The latter misreading becomes more likely for those who think of enumerate as providing indices. In fact, one of the suggested names for enumerate was "indices".
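For readers following the thread, the two readings of a start value can be put side by side. This is only a sketch: it uses the enumerate(iterable, start) form that the later messages in this issue settle on, and the variable names are made up for illustration.

```python
from itertools import islice

letters = 'abcdefg'

# Reading 1: start only offsets the counter; no items are skipped.
offset_pairs = list(enumerate(letters, 3))

# Reading 2 (the misreading): skip the first three items and keep their
# original indices. Spelled explicitly with islice() instead.
skipped_pairs = list(islice(enumerate(letters), 3, None))

print(offset_pairs[0])   # (3, 'a')
print(skipped_pairs[0])  # (3, 'd')
```

Both results begin with the number 3, which is exactly why the two readings are easy to confuse.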
msg66776 - Author: Alyssa Coghlan (ncoghlan) * (Python committer) Date: 2008-05-13 09:49
Note that this functionality is currently available as follows:

    >>> from itertools import count
    >>> list(zip(count(3), 'abcdefg'))
    [(3, 'a'), (4, 'b'), (5, 'c'), (6, 'd'), (7, 'e'), (8, 'f'), (9, 'g')]

The enumerate(itr) builtin is just a convenience to avoid a module import for the most basic zip(count(), itr) version. The proposed patch would enable the example above to be written more verbosely as:

    >>> list(enumerate('abcdefg', start=3))

Or, with the positional argument approach as:

    >>> list(enumerate(3, 'abcdefg'))

So, more verbose than the existing approach, and ambiguous to boot - as Raymond noted, with the first it really isn't clear whether the first value returned would be (3, 'd') or (3, 'a'), and with the second form it isn't clear whether we're skipping the first three items, or returning only those items. Let's keep the builtins simple, and let itertools handle the variants - that's why the module exists.
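The equivalence described in this message can be checked directly. A minimal sketch, assuming Python 3 (where zip() is lazy) and the start keyword that this issue was eventually closed with; the variable names are illustrative only.

```python
from itertools import count

letters = 'abcdefg'

# The itertools spelling suggested in this message.
via_zip = list(zip(count(3), letters))

# The builtin spelling discussed in this issue.
via_enumerate = list(enumerate(letters, start=3))

print(via_zip == via_enumerate)  # True
```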
msg66778 - Author: Alyssa Coghlan (ncoghlan) * (Python committer) Date: 2008-05-13 10:06
Mentioning the zip(count(start), itr) version in the enumerate() docs may be a good idea though. (And of course, in 2.x, it should be izip() rather than zip() to preserve the memory efficiency of enumerate().)
msg66783 - Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2008-05-13 14:24
> Thanks. I think this part is the main reason I see a start argument to
> enumerate as potentially problematic:
>
> """all variants can easily be misread as starting at the nth item in the
> sequence (much like islice() does now): enumerate(3, 'abcdefg') -->
> (3,'d') (4,'e') (5, 'f') (6, 'g')."""

So the ambiguity is that enumerate(it, start=N) could be taken as skipping the first N items of it rather than adding N to the index it returns. (And it is my own argument!)

I'd like to withdraw this argument. There are two separate use cases for using enumerate(): one is to iterate over a sequence and to have a handy index by which to update the value in the sequence. Another is for 1-based counting, usually when printing 1-based ordinals (such as line numbers in files, dates in a month or months in a year, etc.). N-based counting is less common but still conceivable. However I see no use for skipping items from the start, and if that use case ever came up, passing a slice to enumerate() would be the appropriate thing to do. In fact, if you passed in a slice, you might also want to pass a corresponding start value so the indices produced match those of the original sequence.

So, I am still in favor of adding a new argument to enumerate(). I'm neutral on the need for a keyword (don't think it would hurt, not sure how much it matters). I'm strongly against making it an optional *leading* argument like Raymond proposed; that's a style I just don't want to promote, range() and the curses module notwithstanding.

> Is the need to use zip(count(3), seq) for the offset index case really such
> a burden given the associated benefits in keeping the builtin function
> really simple and easy to understand?

Yes, zip(count(3), seq) is too complex for this simple use case. I've always solved this so far with this less-than-elegant but certainly simpler idiom (except for users stuck in the tradition of for-loops in certain older languages :-):

    for i, line in enumerate(lines):
        i += 1
        print "%4d. %s" % (i, line)

and variants thereof.
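The two use cases in this message (1-based ordinals, and a slice paired with a matching start value) could look roughly like this once a start argument exists. This is a sketch in Python 3 syntax using the positional form the issue was closed with; the sample data and the names numbered, first and tail are hypothetical.

```python
lines = ["alpha", "beta", "gamma", "delta"]

# 1-based ordinals without the manual "i += 1" step.
numbered = ["%4d. %s" % (i, line) for i, line in enumerate(lines, 1)]

# Enumerating a slice while keeping the indices of the original sequence:
# pass the slice's starting offset as the start value.
first = 2
tail = list(enumerate(lines[first:], first))

print(numbered[0])  # "   1. alpha"
print(tail)         # [(2, 'gamma'), (3, 'delta')]
```

Every index in tail still points at the same element in lines, which is the property the message is after.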
msg66789 - Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2008-05-13 18:34
Okay. I'm against making the argument keyword-only -- IMO keyword-only arguments really should only be used in cases where their existence has some advantage, like for max().
msg66790 - Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2008-05-13 18:35
Sure, fine.
msg66792 - Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2008-05-13 19:05
Okay, committed a matching patch in r63208. Thank you all!
msg105111 - Author: George Sakkis (gsakkis) Date: 2010-05-05 23:56
Just discovered this by chance; I would probably have noticed it earlier if the docstring had been updated. Let me know if it needs a new documentation bug ticket and I'll create one. Pretty handy feature by the way, thanks for adding it!
msg105145 - Author: Alyssa Coghlan (ncoghlan) * (Python committer) Date: 2010-05-06 13:16
Created issue 8635 for the incomplete docstring
msg105148 - Author: Scott Dial (scott.dial) Date: 2010-05-06 13:49
Created for the broken test cases.
History
Date                 User          Action  Args
2022-04-11 14:56:34  admin         set     github: 47080
2010-05-06 13:49:14  scott.dial    set     messages: +
2010-05-06 13:16:37  ncoghlan      set     messages: +
2010-05-05 23:56:32  gsakkis       set     nosy: + gsakkis; messages: +
2008-05-13 19:05:43  georg.brandl  set     status: open -> closed; resolution: accepted; messages: +
2008-05-13 18:35:06  gvanrossum    set     messages: +
2008-05-13 18:34:08  georg.brandl  set     nosy: + georg.brandl; messages: +
2008-05-13 14:26:33  gvanrossum    set     nosy: + gvanrossum; messages: +
2008-05-13 10:06:40  ncoghlan      set     messages: +
2008-05-13 09:50:04  ncoghlan      set     nosy: + ncoghlan; messages: +
2008-05-12 07:00:02  rhettinger    set     messages: +
2008-05-12 06:35:24  scott.dial    set     messages: +
2008-05-12 06:23:54  rhettinger    set     messages: +
2008-05-12 06:19:12  rhettinger    set     nosy: + rhettinger; messages: +
2008-05-12 03:54:36  scott.dial    set     files: - enumerate.diff
2008-05-12 03:54:32  scott.dial    set     files: + enumerate.diff
2008-05-12 03:53:21  scott.dial    set     files: - enumerate.diff
2008-05-12 03:53:07  scott.dial    set     files: + enumerate.diff
2008-05-12 03:49:33  scott.dial    create