Proposal: Add smart chunking utility (like chunked()) to itertools or stdlib

Hi All,

I’d like to propose adding a general-purpose, composable chunking utility to the Python standard library (possibly under itertools). The goal is to cover a wide range of real-world chunking needs like fixed-size grouping, sliding windows, conditional filtering, and chunk selection logic.

Project Info:

🔧 Features:

Example usage:

from smartchunks import chunked

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# Basic chunking
print(list(chunked(data, size=3)))
# → [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# Sliding window
print(list(chunked(data, size=3, stride=1)))
# → [[1,2,3], [2,3,4], ..., [7,8,9]]

# Advanced selection
print(list(chunked(data, size=2, nth_position=2)))
# → [[1, 3], [5, 7], [9]]

Would love to hear your feedback on whether this could belong in the standard library or potentially as an enhancement to itertools.

Thanks so much!
– Maurya Allimuthu

I'm against it.

The tricky cases don't make for an API that's sensible to use, and the easy cases are easy combinations of zip, iter, and tee.
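For readers following along, the "easy combinations" mentioned here might look roughly like the sketch below. The helper names `fixed_chunks` and `sliding_windows` are made up for illustration and are not part of itertools or of the proposed package. Note that the zip-based chunker silently drops a short trailing remainder.

```python
from itertools import tee


def fixed_chunks(iterable, size):
    """Yield full tuples of `size` items; a shorter trailing remainder is dropped."""
    it = iter(iterable)
    # zip pulls from the same iterator `size` times for each output tuple.
    return zip(*[it] * size)


def sliding_windows(iterable, size):
    """Yield overlapping tuples of `size` consecutive items (stride 1)."""
    iterators = tee(iterable, size)
    # Advance the i-th copy by i items so the copies are staggered.
    for offset, it in enumerate(iterators):
        for _ in range(offset):
            next(it, None)
    return zip(*iterators)


data = [1, 2, 3, 4, 5, 6, 7, 8, 9]

print(list(fixed_chunks(data, 3)))
# → [(1, 2, 3), (4, 5, 6), (7, 8, 9)]

print(list(sliding_windows(data, 3)))
# → [(1, 2, 3), (2, 3, 4), (3, 4, 5), (4, 5, 6), (5, 6, 7), (6, 7, 8), (7, 8, 9)]
```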

NeilGirdhar (Neil Girdhar) June 8, 2025, 3:40pm 3

catchmaurya (Catchmaurya) June 8, 2025, 3:58pm 4

Thanks Ronny and Neil, I appreciate your quick feedback 🙏

@Neil Girdhar:
Yes, I reviewed more_itertools. It’s fantastic — and I agree that chunked, windowed, and stagger cover a lot of ground.
Where smartchunks expands beyond that is in pipeline logic:

So it’s not meant to replace the basics — but to cover cases where chunk filtering and logic routing matter.

@Ronny Pfannschmidt:
Totally fair — the easy cases should stay easy (and zip/tee are still king there).
The more expressive options are inspired by real-world usage in logs, NLP batching, error detection pipelines, etc.

That said, I’d be open to a simpler version like chunked_plus() with a few extra knobs:

Would something in that direction feel more at home in itertools or more_itertools?

NeilGirdhar (Neil Girdhar) June 8, 2025, 4:09pm 5

Just compose functions.

Compose in the other order.

Stagger does this.

Also, it’s a bit odd that since you knew about more-itertools, you would choose examples that are already handled by more-itertools.

catchmaurya (Catchmaurya) June 8, 2025, 4:24pm 6

Hi Neil and team, thanks for the pointer!

I’m absolutely aware of and appreciate how more_itertools provides core utilities like chunked, windowed, and stagger. These cover the essentials brilliantly.

What smartchunks adds on top of that:

  1. nth_position – lets you skip items before or after chunking
  2. chunk_position – lets you sample every nth chunk
  3. pipeline order – choose the order of transformations with apply_nth_before_chunk
  4. stride + filter_fn – get overlapping chunks and filter based on custom logic
  5. Optional padding and generator/list materialization

So where more_itertools solves the building blocks, smartchunks offers a composed chunk-processing pipeline in one function—ideal for conditional workflows like log segmentation, telemetry sampling, or NLP batching.
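For comparison, here is a rough sketch of how the same kinds of pipelines could be assembled from small composable pieces using only the standard library. The `chunks` helper is hypothetical, and the readings of nth_position, chunk_position, and filter_fn are guesses based on the descriptions and the example output earlier in the thread, not the actual smartchunks behaviour.

```python
from itertools import islice


def chunks(iterable, size):
    """Yield lists of up to `size` items; the last list may be shorter."""
    it = iter(iterable)
    while chunk := list(islice(it, size)):
        yield chunk


data = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# Select every 2nd item first, then chunk
# (reproduces the nth_position=2 output from the first post).
every_second_item = islice(data, 0, None, 2)
print(list(chunks(every_second_item, 2)))
# → [[1, 3], [5, 7], [9]]

# Chunk first, then keep every 2nd chunk (one possible reading of chunk_position).
print(list(islice(chunks(data, 2), 0, None, 2)))
# → [[1, 2], [5, 6], [9]]

# Chunk, then filter chunks with a predicate (one possible reading of filter_fn).
print([c for c in chunks(data, 3) if sum(c) > 10])
# → [[4, 5, 6], [7, 8, 9]]
```

Swapping the order of the `islice` and `chunks` calls is all it takes to switch between selecting before chunking and selecting after chunking.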

I’m open to condensing this into a clear `chunked_plus()` or simplified interface if that aligns better with `itertools`. Would love to hear if that reframing makes sense!

Thanks again 🙏
– Maurya

ayhanfuat (Ayhan Ç.) June 8, 2025, 4:25pm 7

itertools added batched in 3.12.
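A quick illustration of `itertools.batched`, which yields tuples and leaves the final batch short if the input does not divide evenly. The fallback branch is only a minimal stand-in so the snippet also runs on Python versions before 3.12; it is an assumption for this example, not the CPython implementation.

```python
import sys
from itertools import islice

if sys.version_info >= (3, 12):
    from itertools import batched
else:
    def batched(iterable, n):
        """Minimal stand-in for itertools.batched on older Pythons."""
        if n < 1:
            raise ValueError("n must be at least one")
        it = iter(iterable)
        while batch := tuple(islice(it, n)):
            yield batch


data = [1, 2, 3, 4, 5, 6, 7, 8, 9]

print(list(batched(data, 3)))
# → [(1, 2, 3), (4, 5, 6), (7, 8, 9)]

print(list(batched(data, 4)))
# → [(1, 2, 3, 4), (5, 6, 7, 8), (9,)]
```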