Issue 23695: idiom for clustering a data series into n-length groups

Created on 2015-03-18 00:18 by Paddy McCarthy, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Messages (6)

msg238365 - (view)

Author: Paddy McCarthy (Paddy McCarthy)

Date: 2015-03-18 00:18

In the zip section of the documentation (e.g. https://docs.python.org/3/library/functions.html#zip) there is mention of an idiom for clustering a data series into n-length groups. I only seem to come across it when people are explaining how it works in blog entries, such as the three mentioned here: http://www.reddit.com/r/programming/comments/2z4rv4/a_function_for_partitioning_python_arrays/cpfvwun?context=3

It is not a straightforward bit of code, so I think it should either be explained in more detail in the documentation or removed as an idiom. Alternatively, I guess it could be encapsulated in a function and added to the stdlib.
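For illustration, the "encapsulated in a function" option might look like the sketch below. The name `grouper` and its signature are assumptions made for this example, not an existing stdlib function (the itertools documentation does carry a similarly named recipe built on `zip_longest`). The sketch assumes Python 3, where `zip` returns an iterator.

```python
def grouper(iterable, n):
    """Yield successive n-length tuples from iterable.

    Hypothetical helper wrapping the zip() idiom. Trailing items that
    do not fill a complete tuple are silently dropped, because zip()
    stops as soon as any of its arguments is exhausted.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    # One iterator, passed to zip n times.
    return zip(*[iter(iterable)] * n)


full = list(grouper([1, 2, 3, 4, 5, 6, 7, 8, 9], 3))
partial = list(grouper(range(7), 3))  # the leftover item 6 is dropped
print(full)
print(partial)
```

Whether dropping the incomplete final group is acceptable depends on the caller, which is one reason a bare idiom can surprise people.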

msg238369 - (view)

Author: Ethan Furman (ethan.furman) * (Python committer)

Date: 2015-03-18 01:00

I think an example should suffice:

    s = [1, 2, 3, 4, 5, 6, 7, 8, 9]
    n = 3
    zip(*[iter(s)]*n)
    [(1, 2, 3), (4, 5, 6), (7, 8, 9)]

msg238378 - (view)

Author: Paddy McCarthy (Paddy McCarthy)

Date: 2015-03-18 05:10

Hmmm. It seems that the problem isn't to do with the fact that it works, or how to apply it; the problem is with how it works.

Making it an idiom means that too many will use it without knowing why it works, which could lead to later maintenance issues. I think a better description of how it works may be needed for the docs.

Unfortunately my description of the how at http://paddy3118.blogspot.co.uk/2012/12/that-grouping-by-zipiter-trick-explained.html was not written with the docs in mind, but you are welcome to any part or the whole, for the Python docs.

msg238453 - (view)

Author: R. David Murray (r.david.murray) * (Python committer)

Date: 2015-03-18 15:33

I think it would be both helpful and sufficient to add a gloss, perhaps something like: "this passes zip n references to the same iterator, which means zip calls that single iterator n times for each tuple it creates; zip thus outputs tuples consisting of n length chunks from the iterator s".
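The behaviour this gloss describes can be checked step by step. The following is a minimal sketch, assuming Python 3 (where `zip` returns an iterator, hence the `list()` calls); the names `it`, `refs`, `chunks` and `independent` are illustrative only.

```python
s = [1, 2, 3, 4, 5, 6, 7, 8, 9]
n = 3

it = iter(s)        # a single iterator over s
refs = [it] * n     # a list holding n references to that same object
assert all(r is it for r in refs)

# zip(*refs) is zip(it, it, it). To build each tuple, zip calls next()
# on each argument in turn, so it pulls n consecutive items from `it`.
chunks = list(zip(*refs))
print(chunks)       # [(1, 2, 3), (4, 5, 6), (7, 8, 9)]

# Contrast: n independent iterators each start at the beginning of s,
# so every tuple repeats one element n times.
independent = list(zip(*[iter(s) for _ in range(n)]))
print(independent[:2])  # [(1, 1, 1), (2, 2, 2)]
```

The contrast with independent iterators is the key point: the grouping comes from sharing one iterator, not from `zip` itself.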

msg238465 - (view)

Author: Paddy McCarthy (Paddy McCarthy)

Date: 2015-03-18 19:01

I like R. David Murray's suggestion, but I already know how it works, so I cannot judge how it would read to an intermediate Python programmer who knows iterators and zip but is new to this grouper (who I think should be the target audience).

msg243064 - (view)

Author: Roundup Robot (python-dev) (Python triager)

Date: 2015-05-13 09:34

New changeset f7d82e40e472 by Raymond Hettinger in branch 'default': Issue #23695: Explain the zip() example for clustering a data series into n-length groups. https://hg.python.org/cpython/rev/f7d82e40e472

History

Date                 User            Action  Args
2022-04-11 14:58:14  admin           set     github: 67883
2015-05-13 09:35:03  rhettinger      set     status: open -> closed; resolution: fixed
2015-05-13 09:34:48  python-dev      set     nosy: + python-dev; messages: +
2015-03-18 19:01:54  Paddy McCarthy  set     messages: +
2015-03-18 15:33:52  r.david.murray  set     nosy: + r.david.murray; messages: +
2015-03-18 05:10:00  Paddy McCarthy  set     messages: +
2015-03-18 04:03:30  rhettinger      set     assignee: docs@python -> rhettinger; nosy: + rhettinger
2015-03-18 01:00:03  ethan.furman    set     nosy: + ethan.furman; messages: +; versions: - Python 3.2, Python 3.3
2015-03-18 00:18:02  Paddy McCarthy  create