API Reference - more-itertools 10.7.0 documentation
More routines for operating on iterables, beyond itertools
Grouping
These tools yield groups of items from a source iterable.
New itertools
more_itertools.chunked(iterable, n, strict=False)
Break iterable into lists of length n:
>>> list(chunked([1, 2, 3, 4, 5, 6], 3))
[[1, 2, 3], [4, 5, 6]]
By default, the last yielded list will have fewer than n elements if the length of iterable is not divisible by n:
>>> list(chunked([1, 2, 3, 4, 5, 6, 7, 8], 3))
[[1, 2, 3], [4, 5, 6], [7, 8]]
To use a fill-in value instead, see the grouper() recipe.
If the length of iterable is not divisible by n and strict is True, then ValueError will be raised before the last list is yielded.
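The strict behavior described above has no example, so here is a minimal standard-library sketch of it. The name chunked_sketch is hypothetical, and this is not necessarily how more_itertools implements chunked(); it only mirrors the documented behavior.

```python
from itertools import islice


def chunked_sketch(iterable, n, strict=False):
    """Yield lists of up to n items; hypothetical stand-in for chunked()."""
    iterator = iter(iterable)
    while True:
        chunk = list(islice(iterator, n))
        if not chunk:
            return
        if strict and len(chunk) != n:
            # The short final chunk is rejected instead of being yielded.
            raise ValueError('iterable is not divisible by n.')
        yield chunk


full_chunks = []
try:
    for chunk in chunked_sketch([1, 2, 3, 4, 5, 6, 7, 8], 3, strict=True):
        full_chunks.append(chunk)
except ValueError:
    error_raised = True
else:
    error_raised = False
```

In this sketch the complete chunks are still delivered before the error surfaces, which matches the statement that the exception is raised before the last list is yielded.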
more_itertools.ichunked(iterable, n)
Break iterable into sub-iterables with n elements each. ichunked() is like chunked(), but it yields iterables instead of lists.
If the sub-iterables are read in order, the elements of iterable won’t be stored in memory. If they are read out of order, itertools.tee() is used to cache elements as necessary.
>>> from itertools import count
>>> all_chunks = ichunked(count(), 4)
>>> c_1, c_2, c_3 = next(all_chunks), next(all_chunks), next(all_chunks)
>>> list(c_2)  # c_1's elements have been cached; c_3's haven't been
[4, 5, 6, 7]
>>> list(c_1)
[0, 1, 2, 3]
>>> list(c_3)
[8, 9, 10, 11]
more_itertools.chunked_even(iterable, n)
Break iterable into lists of approximately length n. Items are distributed such that the lengths of the lists differ by at most 1 item.
>>> iterable = [1, 2, 3, 4, 5, 6, 7]
>>> n = 3
>>> list(chunked_even(iterable, n))  # List lengths: 3, 2, 2
[[1, 2, 3], [4, 5], [6, 7]]
>>> list(chunked(iterable, n))  # List lengths: 3, 3, 1
[[1, 2, 3], [4, 5, 6], [7]]
more_itertools.sliced(seq, n, strict=False)
Yield slices of length n from the sequence seq.
>>> list(sliced((1, 2, 3, 4, 5, 6), 3))
[(1, 2, 3), (4, 5, 6)]
By default, the last yielded slice will have fewer than n elements if the length of seq is not divisible by n:
>>> list(sliced((1, 2, 3, 4, 5, 6, 7, 8), 3))
[(1, 2, 3), (4, 5, 6), (7, 8)]
If the length of seq is not divisible by n and strict is True, then ValueError will be raised before the last slice is yielded.
This function will only work for iterables that support slicing. For non-sliceable iterables, see chunked().
more_itertools.constrained_batches(iterable, max_size, max_count=None, get_len=len, strict=True)
Yield batches of items from iterable with a combined size limited by max_size.
>>> iterable = [b'12345', b'123', b'12345678', b'1', b'1', b'12', b'1']
>>> list(constrained_batches(iterable, 10))
[(b'12345', b'123'), (b'12345678', b'1', b'1'), (b'12', b'1')]
If a max_count is supplied, the number of items per batch is also limited:
>>> iterable = [b'12345', b'123', b'12345678', b'1', b'1', b'12', b'1']
>>> list(constrained_batches(iterable, 10, max_count=2))
[(b'12345', b'123'), (b'12345678', b'1'), (b'1', b'12'), (b'1',)]
If a get_len function is supplied, use that instead of len() to determine item size.
If strict is True, raise ValueError if any single item is bigger than max_size. Otherwise, allow single items to exceed max_size.
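The get_len and strict options are described above without examples. The following standard-library sketch shows one way the documented rules could fit together. The function name constrained_batches_sketch and the (name, size) sample records are hypothetical, and the library's own implementation may differ.

```python
def constrained_batches_sketch(iterable, max_size, max_count=None,
                               get_len=len, strict=True):
    """Group items into tuples whose combined size stays within max_size."""
    batch = []
    batch_size = 0
    for item in iterable:
        item_len = get_len(item)
        if strict and item_len > max_size:
            raise ValueError('item size exceeds maximum size')
        count_full = max_count is not None and len(batch) == max_count
        size_full = batch_size + item_len > max_size
        if batch and (count_full or size_full):
            # Close the current batch before starting a new one.
            yield tuple(batch)
            batch = []
            batch_size = 0
        batch.append(item)
        batch_size += item_len
    if batch:
        yield tuple(batch)


# Hypothetical records: the size is stored in the record, not taken from len().
files = [('a.txt', 5), ('b.txt', 3), ('c.txt', 8), ('d.txt', 1)]
by_size = list(constrained_batches_sketch(files, 10, get_len=lambda f: f[1]))

# With strict=False an oversized item is allowed through in a batch of its own.
oversized = [('big', 12), ('small', 1)]
lenient = list(
    constrained_batches_sketch(oversized, 10, get_len=lambda f: f[1], strict=False)
)
```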
more_itertools.distribute(n, iterable)
Distribute the items from iterable among n smaller iterables.
>>> group_1, group_2 = distribute(2, [1, 2, 3, 4, 5, 6])
>>> list(group_1)
[1, 3, 5]
>>> list(group_2)
[2, 4, 6]
If the length of iterable is not evenly divisible by n, then the length of the returned iterables will not be identical:
>>> children = distribute(3, [1, 2, 3, 4, 5, 6, 7])
>>> [list(c) for c in children]
[[1, 4, 7], [2, 5], [3, 6]]
If the length of iterable is smaller than n, then the last returned iterables will be empty:
>>> children = distribute(5, [1, 2, 3])
>>> [list(c) for c in children]
[[1], [2], [3], [], []]
This function uses itertools.tee() and may require significant storage.
If you need the order of items in the smaller iterables to match the original iterable, see divide().
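To illustrate why itertools.tee() implies extra storage here, the sketch below deals items round-robin using only the standard library. The name distribute_sketch is hypothetical; treat it as an illustration of the idea rather than the library's exact code.

```python
from itertools import islice, tee


def distribute_sketch(n, iterable):
    """Deal items round-robin into n iterators (hypothetical helper)."""
    if n < 1:
        raise ValueError('n must be at least 1')
    children = tee(iterable, n)
    # Child i starts at offset i and takes every n-th item after that.
    return [islice(child, index, None, n) for index, child in enumerate(children)]


group_1, group_2, group_3 = distribute_sketch(3, range(1, 8))
result = [list(group_1), list(group_2), list(group_3)]
```

Reading group_1 to the end before touching the others forces tee() to buffer every item for the remaining children, which is where the storage cost comes from.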
more_itertools.divide(n, iterable)
Divide the elements from iterable into n parts, maintaining order.
>>> group_1, group_2 = divide(2, [1, 2, 3, 4, 5, 6])
>>> list(group_1)
[1, 2, 3]
>>> list(group_2)
[4, 5, 6]
If the length of iterable is not evenly divisible by n, then the length of the returned iterables will not be identical:
>>> children = divide(3, [1, 2, 3, 4, 5, 6, 7])
>>> [list(c) for c in children]
[[1, 2, 3], [4, 5], [6, 7]]
If the length of the iterable is smaller than n, then the last returned iterables will be empty:
>>> children = divide(5, [1, 2, 3])
>>> [list(c) for c in children]
[[1], [2], [3], [], []]
This function will exhaust the iterable before returning. If order is not important, see distribute(), which does not first pull the iterable into memory.
more_itertools.split_at(iterable, pred, maxsplit=-1, keep_separator=False)
Yield lists of items from iterable, where each list is delimited by an item where callable pred returns True.
>>> list(split_at('abcdcba', lambda x: x == 'b'))
[['a'], ['c', 'd', 'c'], ['a']]
>>> list(split_at(range(10), lambda n: n % 2 == 1))
[[0], [2], [4], [6], [8], []]
At most maxsplit splits are done. If maxsplit is not specified or -1, then there is no limit on the number of splits:
>>> list(split_at(range(10), lambda n: n % 2 == 1, maxsplit=2))
[[0], [2], [4, 5, 6, 7, 8, 9]]
By default, the delimiting items are not included in the output. To include them, set keep_separator to True.
>>> list(split_at('abcdcba', lambda x: x == 'b', keep_separator=True))
[['a'], ['b'], ['c', 'd', 'c'], ['b'], ['a']]
more_itertools.split_before(iterable, pred, maxsplit=-1)
Yield lists of items from iterable, where each list ends just before an item for which callable pred returns True:
>>> list(split_before('OneTwo', lambda s: s.isupper()))
[['O', 'n', 'e'], ['T', 'w', 'o']]
>>> list(split_before(range(10), lambda n: n % 3 == 0))
[[0, 1, 2], [3, 4, 5], [6, 7, 8], [9]]
At most maxsplit splits are done. If maxsplit is not specified or -1, then there is no limit on the number of splits:
>>> list(split_before(range(10), lambda n: n % 3 == 0, maxsplit=2))
[[0, 1, 2], [3, 4, 5], [6, 7, 8, 9]]
more_itertools.split_after(iterable, pred, maxsplit=-1)
Yield lists of items from iterable, where each list ends with an item where callable pred returns True:
>>> list(split_after('one1two2', lambda s: s.isdigit()))
[['o', 'n', 'e', '1'], ['t', 'w', 'o', '2']]
>>> list(split_after(range(10), lambda n: n % 3 == 0))
[[0], [1, 2, 3], [4, 5, 6], [7, 8, 9]]
At most maxsplit splits are done. If maxsplit is not specified or -1, then there is no limit on the number of splits:
>>> list(split_after(range(10), lambda n: n % 3 == 0, maxsplit=2))
[[0], [1, 2, 3], [4, 5, 6, 7, 8, 9]]
more_itertools.split_into(iterable, sizes)
Yield a list of sequential items from iterable of length n for each integer n in sizes.
>>> list(split_into([1,2,3,4,5,6], [1,2,3]))
[[1], [2, 3], [4, 5, 6]]
If the sum of sizes is smaller than the length of iterable, then the remaining items of iterable will not be returned.
>>> list(split_into([1,2,3,4,5,6], [2,3]))
[[1, 2], [3, 4, 5]]
If the sum of sizes is larger than the length of iterable, fewer items will be returned in the iteration that overruns iterable and further lists will be empty:
>>> list(split_into([1,2,3,4], [1,2,3,4]))
[[1], [2, 3], [4], []]
When a None object is encountered in sizes, the returned list will contain items up to the end of iterable, the same way that itertools.islice() does:
>>> list(split_into([1,2,3,4,5,6,7,8,9,0], [2,3,None]))
[[1, 2], [3, 4, 5], [6, 7, 8, 9, 0]]
split_into() can be useful for grouping a series of items where the sizes of the groups are not uniform. An example would be a row from a table in which multiple columns represent elements of the same feature (e.g. a point represented by x, y, z), but the format is not the same for all columns.
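The table-row use case can be sketched as follows, using only the standard library. The helper name split_into_sketch and the row layout (an id, a 3-D point, a 2-D texture coordinate, then free-form tags) are assumptions made for illustration.

```python
from itertools import islice


def split_into_sketch(iterable, sizes):
    """Yield consecutive lists whose lengths follow sizes (hypothetical helper)."""
    iterator = iter(iterable)
    for size in sizes:
        if size is None:
            # None means "everything that is left".
            yield list(iterator)
            return
        yield list(islice(iterator, size))


# Hypothetical table row: id, point (x, y, z), texture (u, v), then tags.
row = [17, 0.5, 1.5, 2.5, 0.25, 0.75, 'red', 'large']
record_id, point, texture, tags = split_into_sketch(row, [1, 3, 2, None])
```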
more_itertools.split_when(iterable, pred, maxsplit=-1)
Split iterable into pieces based on the output of pred. pred should be a function that takes successive pairs of items and returns True if the iterable should be split in between them.
For example, to find runs of increasing numbers, split the iterable when element i is larger than element i + 1:
>>> list(split_when([1, 2, 3, 3, 2, 5, 2, 4, 2], lambda x, y: x > y))
[[1, 2, 3, 3], [2, 5], [2, 4], [2]]
At most maxsplit splits are done. If maxsplit is not specified or -1, then there is no limit on the number of splits:
>>> list(split_when([1, 2, 3, 3, 2, 5, 2, 4, 2],
...                 lambda x, y: x > y, maxsplit=2))
[[1, 2, 3, 3], [2, 5], [2, 4, 2]]
more_itertools.bucket(iterable, key, validator=None)
Wrap iterable and return an object that buckets the iterable into child iterables based on a key function.
>>> iterable = ['a1', 'b1', 'c1', 'a2', 'b2', 'c2', 'b3']
>>> s = bucket(iterable, key=lambda x: x[0])  # Bucket by 1st character
>>> sorted(list(s))  # Get the keys
['a', 'b', 'c']
>>> a_iterable = s['a']
>>> next(a_iterable)
'a1'
>>> next(a_iterable)
'a2'
>>> list(s['b'])
['b1', 'b2', 'b3']
The original iterable will be advanced and its items will be cached until they are used by the child iterables. This may require significant storage.
By default, attempting to select a bucket to which no items belong will exhaust the iterable and cache all values. If you specify a validator function, selected buckets will instead be checked against it.
>>> from itertools import count
>>> it = count(1, 2)  # Infinite sequence of odd numbers
>>> key = lambda x: x % 10  # Bucket by last digit
>>> validator = lambda x: x in {1, 3, 5, 7, 9}  # Odd digits only
>>> s = bucket(it, key=key, validator=validator)
>>> 2 in s
False
>>> list(s[2])
[]
more_itertools.unzip(iterable)
The inverse of zip(), this function disaggregates the elements of the zipped iterable.
The i-th iterable contains the i-th element from each element of the zipped iterable. The first element is used to determine the length of the remaining elements.
>>> iterable = [('a', 1), ('b', 2), ('c', 3), ('d', 4)]
>>> letters, numbers = unzip(iterable)
>>> list(letters)
['a', 'b', 'c', 'd']
>>> list(numbers)
[1, 2, 3, 4]
This is similar to using zip(*iterable), but it avoids reading iterable into memory. Note, however, that this function uses itertools.tee() and thus may require significant storage.
Itertools recipes
more_itertools.batched(iterable, n, *, strict=False)
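No description accompanies this entry in the text above. As a rough guide, and assuming batched() behaves like the itertools recipe of the same name, it yields tuples of up to n items, with a shorter final tuple unless strict is set. The sketch below uses a hypothetical name and only the standard library; it is not taken from the more_itertools source.

```python
from itertools import islice


def batched_sketch(iterable, n, *, strict=False):
    """Yield tuples of up to n items, modeled on the itertools batched recipe."""
    if n < 1:
        raise ValueError('n must be at least one')
    iterator = iter(iterable)
    while batch := tuple(islice(iterator, n)):
        if strict and len(batch) != n:
            raise ValueError('batched(): incomplete batch')
        yield batch


batches = list(batched_sketch('ABCDEFG', 3))
```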
more_itertools.grouper(iterable, n, incomplete='fill', fillvalue=None)
Group elements from iterable into fixed-length groups of length n.
>>> list(grouper('ABCDEF', 3))
[('A', 'B', 'C'), ('D', 'E', 'F')]
The keyword arguments incomplete and fillvalue control what happens for iterables whose length is not a multiple of n.
When incomplete is 'fill', the last group will contain instances of fillvalue.
>>> list(grouper('ABCDEFG', 3, incomplete='fill', fillvalue='x'))
[('A', 'B', 'C'), ('D', 'E', 'F'), ('G', 'x', 'x')]
When incomplete is 'ignore', the last group will not be emitted.
>>> list(grouper('ABCDEFG', 3, incomplete='ignore', fillvalue='x'))
[('A', 'B', 'C'), ('D', 'E', 'F')]
When incomplete is 'strict', a subclass of ValueError will be raised.
>>> iterator = grouper('ABCDEFG', 3, incomplete='strict')
>>> list(iterator)
Traceback (most recent call last):
...
UnequalIterablesError
more_itertools.partition(pred, iterable)
Returns a 2-tuple of iterables derived from the input iterable. The first yields the items that have pred(item) == False. The second yields the items that have pred(item) == True.
>>> is_odd = lambda x: x % 2 != 0
>>> iterable = range(10)
>>> even_items, odd_items = partition(is_odd, iterable)
>>> list(even_items), list(odd_items)
([0, 2, 4, 6, 8], [1, 3, 5, 7, 9])
If pred is None, bool() is used.
>>> iterable = [0, 1, False, True, '', ' ']
>>> false_items, true_items = partition(None, iterable)
>>> list(false_items), list(true_items)
([0, False, ''], [1, True, ' '])
more_itertools.transpose(it)
Swap the rows and columns of the input matrix.
>>> list(transpose([(1, 2, 3), (11, 22, 33)]))
[(1, 11), (2, 22), (3, 33)]
The caller should ensure that the dimensions of the input are compatible. If the input is empty, no output will be produced.
Lookahead and lookback
These tools peek at an iterable’s values without advancing it.
New itertools
more_itertools.spy(iterable, n=1)
Return a 2-tuple with a list containing the first n elements of iterable, and an iterator with the same items as iterable. This allows you to “look ahead” at the items in the iterable without advancing it.
There is one item in the list by default:
>>> iterable = 'abcdefg'
>>> head, iterable = spy(iterable)
>>> head
['a']
>>> list(iterable)
['a', 'b', 'c', 'd', 'e', 'f', 'g']
You may use unpacking to retrieve items instead of lists:
>>> (head,), iterable = spy('abcdefg')
>>> head
'a'
>>> (first, second), iterable = spy('abcdefg', 2)
>>> first
'a'
>>> second
'b'
The number of items requested can be larger than the number of items in the iterable:
>>> iterable = [1, 2, 3, 4, 5]
>>> head, iterable = spy(iterable, 10)
>>> head
[1, 2, 3, 4, 5]
>>> list(iterable)
[1, 2, 3, 4, 5]
class more_itertools.peekable(iterable)
Wrap an iterator to allow lookahead and prepending elements.
Call peek() on the result to get the value that will be returned by next(). This won’t advance the iterator:
>>> p = peekable(['a', 'b'])
>>> p.peek()
'a'
>>> next(p)
'a'
Pass peek() a default value to return that instead of raising StopIteration when the iterator is exhausted.
>>> p = peekable([])
>>> p.peek('hi')
'hi'
peekables also offer a prepend() method, which “inserts” items at the head of the iterable:
>>> p = peekable([1, 2, 3])
>>> p.prepend(10, 11, 12)
>>> next(p)
10
>>> p.peek()
11
>>> list(p)
[11, 12, 1, 2, 3]
peekables can be indexed. Index 0 is the item that will be returned by next(), index 1 is the item after that, and so on. The values up to the given index will be cached.
>>> p = peekable(['a', 'b', 'c', 'd'])
>>> p[0]
'a'
>>> p[1]
'b'
>>> next(p)
'a'
Negative indexes are supported, but be aware that they will cache the remaining items in the source iterator, which may require significant storage.
To check whether a peekable is exhausted, check its truth value:
>>> p = peekable(['a', 'b'])
>>> if p:  # peekable has items
...     list(p)
['a', 'b']
>>> if not p:  # peekable is exhausted
...     list(p)
[]
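The caching idea behind lookahead, prepending, and truth-value testing can be sketched with a small class built on collections.deque. PeekableSketch is a hypothetical, simplified stand-in (no indexing support) and is not the library's implementation.

```python
from collections import deque

_MISSING = object()  # Sentinel meaning "no default supplied"


class PeekableSketch:
    """Hypothetical, simplified lookahead wrapper; not the library's class."""

    def __init__(self, iterable):
        self._iterator = iter(iterable)
        self._cache = deque()

    def __iter__(self):
        return self

    def __next__(self):
        # Serve cached (peeked or prepended) items before the source.
        if self._cache:
            return self._cache.popleft()
        return next(self._iterator)

    def peek(self, default=_MISSING):
        if not self._cache:
            try:
                self._cache.append(next(self._iterator))
            except StopIteration:
                if default is _MISSING:
                    raise
                return default
        return self._cache[0]

    def prepend(self, *items):
        # extendleft reverses its input, so reverse first to keep the order.
        self._cache.extendleft(reversed(items))

    def __bool__(self):
        try:
            self.peek()
        except StopIteration:
            return False
        return True


p = PeekableSketch([1, 2, 3])
p.prepend(10, 11, 12)
first = next(p)
upcoming = p.peek()
rest = list(p)
```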
class more_itertools.seekable(iterable, maxlen=None)
Wrap an iterator to allow for seeking backward and forward. This progressively caches the items in the source iterable so they can be re-visited.
Call seek() with an index to seek to that position in the source iterable.
To “reset” an iterator, seek to 0:
>>> from itertools import count
>>> it = seekable((str(n) for n in count()))
>>> next(it), next(it), next(it)
('0', '1', '2')
>>> it.seek(0)
>>> next(it), next(it), next(it)
('0', '1', '2')
You can also seek forward:
>>> it = seekable((str(n) for n in range(20)))
>>> it.seek(10)
>>> next(it)
'10'
>>> it.seek(20)  # Seeking past the end of the source isn't a problem
>>> list(it)
[]
>>> it.seek(0)  # Resetting works even after hitting the end
>>> next(it)
'0'
Call relative_seek() to seek relative to the source iterator’s current position.
>>> it = seekable((str(n) for n in range(20)))
>>> next(it), next(it), next(it)
('0', '1', '2')
>>> it.relative_seek(2)
>>> next(it)
'5'
>>> it.relative_seek(-3)  # Source is at '6', we move back to '3'
>>> next(it)
'3'
>>> it.relative_seek(-3)  # Source is at '4', we move back to '1'
>>> next(it)
'1'
Call peek() to look ahead one item without advancing the iterator:
>>> it = seekable('1234')
>>> it.peek()
'1'
>>> list(it)
['1', '2', '3', '4']
>>> it.peek(default='empty')
'empty'
Before the iterator is at its end, calling bool() on it will return True. Afterward, it will return False:
>>> it = seekable('5678')
>>> bool(it)
True
>>> list(it)
['5', '6', '7', '8']
>>> bool(it)
False
You may view the contents of the cache with the elements() method. That returns a SequenceView, a view that updates automatically:
>>> it = seekable((str(n) for n in range(10)))
>>> next(it), next(it), next(it)
('0', '1', '2')
>>> elements = it.elements()
>>> elements
SequenceView(['0', '1', '2'])
>>> next(it)
'3'
>>> elements
SequenceView(['0', '1', '2', '3'])
By default, the cache grows as the source iterable progresses, so beware of wrapping very large or infinite iterables. Supply maxlen to limit the size of the cache (this of course limits how far back you can seek).
>>> from itertools import count
>>> it = seekable((str(n) for n in count()), maxlen=2)
>>> next(it), next(it), next(it), next(it)
('0', '1', '2', '3')
>>> list(it.elements())
['2', '3']
>>> it.seek(0)
>>> next(it), next(it), next(it), next(it)
('2', '3', '4', '5')
>>> next(it)
'6'
Windowing
These tools yield windows of items from an iterable.
New itertools
more_itertools.windowed(seq, n, fillvalue=None, step=1)
Return a sliding window of width n over the given iterable.
>>> all_windows = windowed([1, 2, 3, 4, 5], 3)
>>> list(all_windows)
[(1, 2, 3), (2, 3, 4), (3, 4, 5)]
When the window is larger than the iterable, fillvalue is used in place of missing values:
>>> list(windowed([1, 2, 3], 4))
[(1, 2, 3, None)]
Each window will advance in increments of step:
>>> list(windowed([1, 2, 3, 4, 5, 6], 3, fillvalue='!', step=2))
[(1, 2, 3), (3, 4, 5), (5, 6, '!')]
To slide into the iterable’s items, use itertools.chain() to add filler items to the left:
>>> from itertools import chain
>>> iterable = [1, 2, 3, 4]
>>> n = 3
>>> padding = [None] * (n - 1)
>>> list(windowed(chain(padding, iterable), 3))
[(None, None, 1), (None, 1, 2), (1, 2, 3), (2, 3, 4)]
more_itertools.substrings(iterable)
Yield all of the substrings of iterable.
>>> [''.join(s) for s in substrings('more')]
['m', 'o', 'r', 'e', 'mo', 'or', 're', 'mor', 'ore', 'more']
Note that non-string iterables can also be subdivided.
>>> list(substrings([0, 1, 2]))
[(0,), (1,), (2,), (0, 1), (1, 2), (0, 1, 2)]
more_itertools.substrings_indexes(seq, reverse=False)
Yield all substrings and their positions in seq.
The items yielded will be a tuple of the form (substr, i, j), where substr == seq[i:j].
This function only works for iterables that support slicing, such as str objects.
>>> for item in substrings_indexes('more'):
...     print(item)
('m', 0, 1)
('o', 1, 2)
('r', 2, 3)
('e', 3, 4)
('mo', 0, 2)
('or', 1, 3)
('re', 2, 4)
('mor', 0, 3)
('ore', 1, 4)
('more', 0, 4)
Set reverse to True to yield the same items in the opposite order.
more_itertools.stagger(iterable, offsets=(-1, 0, 1), longest=False, fillvalue=None)
Yield tuples whose elements are offset from iterable. The amount by which the i-th item in each tuple is offset is given by the i-th item in offsets.
>>> list(stagger([0, 1, 2, 3]))
[(None, 0, 1), (0, 1, 2), (1, 2, 3)]
>>> list(stagger(range(8), offsets=(0, 2, 4)))
[(0, 2, 4), (1, 3, 5), (2, 4, 6), (3, 5, 7)]
By default, the sequence will end when the final element of a tuple is the last item in the iterable. To continue until the first element of a tuple is the last item in the iterable, set longest to True:
>>> list(stagger([0, 1, 2, 3], longest=True))
[(None, 0, 1), (0, 1, 2), (1, 2, 3), (2, 3, None), (3, None, None)]
By default, None will be used to replace offsets beyond the end of the sequence. Specify fillvalue to use some other value.
more_itertools.windowed_complete(iterable, n)
Yield (beginning, middle, end) tuples, where:
- Each middle has n items from iterable
- Each beginning has the items before the ones in middle
- Each end has the items after the ones in middle
>>> iterable = range(7)
>>> n = 3
>>> for beginning, middle, end in windowed_complete(iterable, n):
...     print(beginning, middle, end)
() (0, 1, 2) (3, 4, 5, 6)
(0,) (1, 2, 3) (4, 5, 6)
(0, 1) (2, 3, 4) (5, 6)
(0, 1, 2) (3, 4, 5) (6,)
(0, 1, 2, 3) (4, 5, 6) ()
Note that n must be at least 0 and at most equal to the length of iterable.
This function will exhaust the iterable and may require significant storage.
Itertools recipes
more_itertools.pairwise(iterable)
Returns an iterator of paired items, overlapping, from the original iterable.
>>> take(4, pairwise(count()))
[(0, 1), (1, 2), (2, 3), (3, 4)]
On Python 3.10 and above, this is an alias for itertools.pairwise().
more_itertools.triplewise(iterable)
Return overlapping triplets from iterable.
>>> list(triplewise('ABCDE'))
[('A', 'B', 'C'), ('B', 'C', 'D'), ('C', 'D', 'E')]
more_itertools.sliding_window(iterable, n)
Return a sliding window of width n over iterable.
>>> list(sliding_window(range(6), 4))
[(0, 1, 2, 3), (1, 2, 3, 4), (2, 3, 4, 5)]
If iterable has fewer than n items, then nothing is yielded:
>>> list(sliding_window(range(3), 4))
[]
For a variant with more features, see windowed().
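A fixed-width window like this is commonly built on a bounded collections.deque. The sketch below, under the hypothetical name sliding_window_sketch, shows that approach with the standard library only. It reproduces the two results shown for sliding_window() but is not claimed to be the library's code.

```python
from collections import deque
from itertools import islice


def sliding_window_sketch(iterable, n):
    """Yield overlapping tuples of width n (hypothetical helper)."""
    iterator = iter(iterable)
    # Pre-fill with n - 1 items; each later item completes one window.
    window = deque(islice(iterator, n - 1), maxlen=n)
    for item in iterator:
        # With maxlen=n, appending drops the oldest item automatically.
        window.append(item)
        yield tuple(window)


windows = list(sliding_window_sketch(range(6), 4))
too_short = list(sliding_window_sketch(range(3), 4))
```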
more_itertools.subslices(iterable)
Return all contiguous non-empty subslices of iterable.
>>> list(subslices('ABC'))
[['A'], ['A', 'B'], ['A', 'B', 'C'], ['B'], ['B', 'C'], ['C']]
This is similar to substrings(), but emits items in a different order.
Augmenting
These tools yield items from an iterable, plus additional data.
New itertools
more_itertools.count_cycle(iterable, n=None)
Cycle through the items from iterable up to n times, yielding the number of completed cycles along with each item. If n is omitted, the process repeats indefinitely.
>>> list(count_cycle('AB', 3))
[(0, 'A'), (0, 'B'), (1, 'A'), (1, 'B'), (2, 'A'), (2, 'B')]
more_itertools.intersperse(e, iterable, n=1)
Intersperse filler element e among the items in iterable, leaving n items between each filler element.
>>> list(intersperse('!', [1, 2, 3, 4, 5]))
[1, '!', 2, '!', 3, '!', 4, '!', 5]
>>> list(intersperse(None, [1, 2, 3, 4, 5], n=2))
[1, 2, None, 3, 4, None, 5]
more_itertools.padded(iterable, fillvalue=None, n=None, next_multiple=False)
Yield the elements from iterable, followed by fillvalue, such that at least n items are emitted.
>>> list(padded([1, 2, 3], '?', 5))
[1, 2, 3, '?', '?']
If next_multiple is True, fillvalue will be emitted until the number of items emitted is a multiple of n:
>>> list(padded([1, 2, 3, 4], n=3, next_multiple=True))
[1, 2, 3, 4, None, None]
If n is None, fillvalue will be emitted indefinitely.
To create an iterable of exactly size n, you can truncate with islice().
>>> list(islice(padded([1, 2, 3], '?'), 5))
[1, 2, 3, '?', '?']
>>> list(islice(padded([1, 2, 3, 4, 5, 6, 7, 8], '?'), 5))
[1, 2, 3, 4, 5]
more_itertools.mark_ends(iterable)
Yield 3-tuples of the form (is_first, is_last, item).
>>> list(mark_ends('ABC'))
[(True, False, 'A'), (False, False, 'B'), (False, True, 'C')]
Use this when looping over an iterable to take special action on its first and/or last items:
>>> iterable = ['Header', 100, 200, 'Footer']
>>> total = 0
>>> for is_first, is_last, item in mark_ends(iterable):
...     if is_first:
...         continue  # Skip the header
...     if is_last:
...         continue  # Skip the footer
...     total += item
>>> print(total)
300
more_itertools.repeat_each(iterable, n=2)
Repeat each element in iterable n times.
>>> list(repeat_each('ABC', 3))
['A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'C']
more_itertools.repeat_last(iterable, default=None)
After the iterable is exhausted, keep yielding its last element.
>>> list(islice(repeat_last(range(3)), 5))
[0, 1, 2, 2, 2]
If the iterable is empty, yield default forever:
>>> list(islice(repeat_last(range(0), 42), 5))
[42, 42, 42, 42, 42]
more_itertools.adjacent(predicate, iterable, distance=1)
Return an iterable over (bool, item) tuples where the item is drawn from iterable and the bool indicates whether that item satisfies the predicate or is adjacent to an item that does.
For example, to find whether items are adjacent to a 3:
>>> list(adjacent(lambda x: x == 3, range(6)))
[(False, 0), (False, 1), (True, 2), (True, 3), (True, 4), (False, 5)]
Set distance to change what counts as adjacent. For example, to find whether items are two places away from a 3:
>>> list(adjacent(lambda x: x == 3, range(6), distance=2))
[(False, 0), (True, 1), (True, 2), (True, 3), (True, 4), (True, 5)]
This is useful for contextualizing the results of a search function. For example, a code comparison tool might want to identify lines that have changed, but also surrounding lines to give the viewer of the diff context.
The predicate function will only be called once for each item in the iterable.
See also groupby_transform(), which can be used with this function to group ranges of items with the same bool value.
more_itertools.groupby_transform(iterable, keyfunc=None, valuefunc=None, reducefunc=None)
An extension of itertools.groupby() that can apply transformations to the grouped data.
- keyfunc is a function computing a key value for each item in iterable
- valuefunc is a function that transforms the individual items from iterable after grouping
- reducefunc is a function that transforms each group of items
>>> iterable = 'aAAbBBcCC'
>>> keyfunc = lambda k: k.upper()
>>> valuefunc = lambda v: v.lower()
>>> reducefunc = lambda g: ''.join(g)
>>> list(groupby_transform(iterable, keyfunc, valuefunc, reducefunc))
[('A', 'aaa'), ('B', 'bbb'), ('C', 'ccc')]
Each optional argument defaults to an identity function if not specified.
groupby_transform() is useful when grouping elements of an iterable using a separate iterable as the key. To do this, zip() the iterables and pass a keyfunc that extracts the first element and a valuefunc that extracts the second element:
>>> from operator import itemgetter
>>> keys = [0, 0, 1, 1, 1, 2, 2, 2, 3]
>>> values = 'abcdefghi'
>>> iterable = zip(keys, values)
>>> grouper = groupby_transform(iterable, itemgetter(0), itemgetter(1))
>>> [(k, ''.join(g)) for k, g in grouper]
[(0, 'ab'), (1, 'cde'), (2, 'fgh'), (3, 'i')]
Note that the order of items in the iterable is significant. Only adjacent items are grouped together, so if you don’t want any duplicate groups, you should sort the iterable by the key function.
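Since groupby_transform() is described as an extension of itertools.groupby(), the adjacency rule can be illustrated with plain groupby() from the standard library. The word list and the first_letter helper below are made up for this illustration.

```python
from itertools import groupby

words = ['apple', 'bob', 'avocado', 'banana']


def first_letter(word):
    return word[0]


# Unsorted input: 'a' and 'b' each show up as two separate groups.
unsorted_groups = [(key, list(group)) for key, group in groupby(words, first_letter)]

# Sorting by the same key function yields exactly one group per key.
sorted_groups = [
    (key, list(group))
    for key, group in groupby(sorted(words, key=first_letter), first_letter)
]
```

Because sorted() is stable, items keep their original relative order inside each merged group.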
Itertools recipes
more_itertools.padnone()
more_itertools.pad_none(iterable)
Returns the sequence of elements and then returns None indefinitely.
>>> take(5, pad_none(range(3)))
[0, 1, 2, None, None]
Useful for emulating the behavior of the built-in map() function.
See also padded().
more_itertools.ncycles(iterable, n)
Returns the sequence elements n times.
>>> list(ncycles(["a", "b"], 3))
['a', 'b', 'a', 'b', 'a', 'b']
Combining
These tools combine multiple iterables.
New itertools
more_itertools.collapse(iterable, base_type=None, levels=None)
Flatten an iterable with multiple levels of nesting (e.g., a list of lists of tuples) into non-iterable types.
>>> iterable = [(1, 2), ([3, 4], [[5], [6]])]
>>> list(collapse(iterable))
[1, 2, 3, 4, 5, 6]
Binary and text strings are not considered iterable and will not be collapsed.
To avoid collapsing other types, specify base_type:
>>> iterable = ['ab', ('cd', 'ef'), ['gh', 'ij']]
>>> list(collapse(iterable, base_type=tuple))
['ab', ('cd', 'ef'), 'gh', 'ij']
Specify levels to stop flattening after a certain level:
>>> iterable = [('a', ['b']), ('c', ['d'])]
>>> list(collapse(iterable))  # Fully flattened
['a', 'b', 'c', 'd']
>>> list(collapse(iterable, levels=1))  # Only one level flattened
['a', ['b'], 'c', ['d']]
more_itertools.interleave(*iterables)
Return a new iterable yielding from each iterable in turn, until the shortest is exhausted.
>>> list(interleave([1, 2, 3], [4, 5], [6, 7, 8]))
[1, 4, 6, 2, 5, 7]
For a version that doesn’t terminate after the shortest iterable is exhausted, see interleave_longest().
more_itertools.interleave_longest(*iterables)
Return a new iterable yielding from each iterable in turn, skipping any that are exhausted.
>>> list(interleave_longest([1, 2, 3], [4, 5], [6, 7, 8]))
[1, 4, 6, 2, 5, 7, 3, 8]
This function produces the same output as roundrobin(), but may perform better for some inputs (in particular when the number of iterables is large).
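One way to get this "skip the exhausted ones" behavior with the standard library is to zip the inputs with itertools.zip_longest() and filter out a private filler object. The sketch below uses a hypothetical name and is offered only as an illustration; it is not asserted to be how either function is implemented.

```python
from itertools import chain, zip_longest

_MARKER = object()  # Unique filler that cannot collide with real items


def interleave_longest_sketch(*iterables):
    """Yield items from each iterable in turn, skipping exhausted ones."""
    rounds = zip_longest(*iterables, fillvalue=_MARKER)
    return (item for item in chain.from_iterable(rounds) if item is not _MARKER)


result = list(interleave_longest_sketch([1, 2, 3], [4, 5], [6, 7, 8]))
```

Each round touches every input once, whether or not it is exhausted, so the work per round does not depend on how many inputs are still active.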
more_itertools.interleave_evenly(iterables, lengths=None)
Interleave multiple iterables so that their elements are evenly distributed throughout the output sequence.
>>> iterables = [1, 2, 3, 4, 5], ['a', 'b']
>>> list(interleave_evenly(iterables))
[1, 2, 'a', 3, 4, 'b', 5]
>>> iterables = [[1, 2, 3], [4, 5], [6, 7, 8]]
>>> list(interleave_evenly(iterables))
[1, 6, 4, 2, 7, 3, 8, 5]
This function requires iterables of known length. Iterables without __len__() can be used by specifying their lengths manually with lengths:
>>> from itertools import combinations, repeat
>>> iterables = [combinations(range(4), 2), ['a', 'b', 'c']]
>>> lengths = [4 * (4 - 1) // 2, 3]
>>> list(interleave_evenly(iterables, lengths=lengths))
[(0, 1), (0, 2), 'a', (0, 3), (1, 2), 'b', (1, 3), (2, 3), 'c']
Based on Bresenham’s algorithm.
more_itertools.partial_product(*iterables)
Yields tuples containing one item from each iterator, with subsequent tuples changing a single item at a time by advancing each iterator until it is exhausted. This sequence guarantees every value in each iterable is output at least once without generating all possible combinations.
This may be useful, for example, when testing an expensive function.
>>> list(partial_product('AB', 'C', 'DEF'))
[('A', 'C', 'D'), ('B', 'C', 'D'), ('B', 'C', 'E'), ('B', 'C', 'F')]
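The "advance one iterator at a time" procedure described above can be sketched as follows. The name partial_product_sketch is hypothetical, and the code simply follows the prose description rather than quoting the library.

```python
def partial_product_sketch(*iterables):
    """Yield tuples that change one position at a time (hypothetical helper)."""
    iterators = [iter(iterable) for iterable in iterables]
    try:
        # Start from the first item of every iterable.
        current = [next(iterator) for iterator in iterators]
    except StopIteration:
        return  # An empty input means there is nothing to yield.
    yield tuple(current)
    for index, iterator in enumerate(iterators):
        # Exhaust this iterator while every other position stays fixed.
        for value in iterator:
            current[index] = value
            yield tuple(current)


result = list(partial_product_sketch('AB', 'C', 'DEF'))
```

For inputs of lengths 2, 1, and 3 this produces 4 tuples, compared with the 6 tuples of the full Cartesian product.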
more_itertools.sort_together(iterables, key_list=(0,), key=None, reverse=False, strict=False)
Return the input iterables sorted together, with key_list as the priority for sorting. All iterables are trimmed to the length of the shortest one.
This can be used like the sorting function in a spreadsheet. If each iterable represents a column of data, the key list determines which columns are used for sorting.
By default, all iterables are sorted using the 0-th iterable:
>>> iterables = [(4, 3, 2, 1), ('a', 'b', 'c', 'd')]
>>> sort_together(iterables)
[(1, 2, 3, 4), ('d', 'c', 'b', 'a')]
Set a different key list to sort according to another iterable. Specifying multiple keys dictates how ties are broken:
>>> iterables = [(3, 1, 2), (0, 1, 0), ('c', 'b', 'a')]
>>> sort_together(iterables, key_list=(1, 2))
[(2, 3, 1), (0, 0, 1), ('a', 'c', 'b')]
To sort by a function of the elements of the iterable, pass a key function. Its arguments are the elements of the iterables corresponding to the key list:
>>> names = ('a', 'b', 'c')
>>> lengths = (1, 2, 3)
>>> widths = (5, 2, 1)
>>> def area(length, width):
...     return length * width
>>> sort_together([names, lengths, widths], key_list=(1, 2), key=area)
[('c', 'b', 'a'), (3, 2, 1), (1, 2, 5)]
Set reverse to True to sort in descending order.
>>> sort_together([(1, 2, 3), ('c', 'b', 'a')], reverse=True)
[(3, 2, 1), ('a', 'b', 'c')]
If the strict keyword argument is True, then UnequalIterablesError will be raised if any of the iterables have different lengths.
more_itertools.value_chain(*args)
Yield all arguments passed to the function in the same order in which they were passed. If an argument itself is iterable, then iterate over its values.
>>> list(value_chain(1, 2, 3, [4, 5, 6]))
[1, 2, 3, 4, 5, 6]
Binary and text strings are not considered iterable and are emitted as-is:
>>> list(value_chain('12', '34', ['56', '78']))
['12', '34', '56', '78']
Prepend or append a single element to an iterable:
>>> list(value_chain(1, [2, 3, 4, 5, 6]))
[1, 2, 3, 4, 5, 6]
>>> list(value_chain([1, 2, 3, 4, 5], 6))
[1, 2, 3, 4, 5, 6]
Multiple levels of nesting are not flattened.
more_itertools.zip_offset(*iterables, offsets, longest=False, fillvalue=None)
zip the input iterables together, but offset the i-th iterable by the i-th item in offsets.
>>> list(zip_offset('0123', 'abcdef', offsets=(0, 1)))
[('0', 'b'), ('1', 'c'), ('2', 'd'), ('3', 'e')]
This can be used as a lightweight alternative to SciPy or pandas to analyze data sets in which some series have a lead or lag relationship.
By default, the sequence will end when the shortest iterable is exhausted. To continue until the longest iterable is exhausted, set longest to True.
>>> list(zip_offset('0123', 'abcdef', offsets=(0, 1), longest=True))
[('0', 'b'), ('1', 'c'), ('2', 'd'), ('3', 'e'), (None, 'f')]
By default, None will be used to replace offsets beyond the end of the sequence. Specify fillvalue to use some other value.
more_itertools.zip_equal(*iterables)
zip the input iterables together, but raise UnequalIterablesError if they aren’t all the same length.
>>> it_1 = range(3)
>>> it_2 = iter('abc')
>>> list(zip_equal(it_1, it_2))
[(0, 'a'), (1, 'b'), (2, 'c')]
>>> it_1 = range(3)
>>> it_2 = iter('abcd')
>>> list(zip_equal(it_1, it_2))
Traceback (most recent call last):
...
more_itertools.more.UnequalIterablesError: Iterables have different lengths
more_itertools.zip_broadcast(*objects, scalar_types=(str, bytes), strict=False)[source]¶
A version of zip() that “broadcasts” any scalar (i.e., non-iterable) items into output tuples.
iterable_1 = [1, 2, 3] iterable_2 = ['a', 'b', 'c'] scalar = '' list(zip_broadcast(iterable_1, iterable_2, scalar)) [(1, 'a', ''), (2, 'b', ''), (3, 'c', '')]
The scalar_types keyword argument determines what types are considered scalar. It is set to (str, bytes) by default. Set it to None to treat strings and byte strings as iterable:
list(zip_broadcast('abc', 0, 'xyz', scalar_types=None)) [('a', 0, 'x'), ('b', 0, 'y'), ('c', 0, 'z')]
If the strict keyword argument is True, then UnequalIterablesError will be raised if any of the iterables have different lengths.
Itertools recipes
more_itertools.flatten(listOfLists)[source]¶
Return an iterator flattening one level of nesting in a list of lists.
list(flatten([[0, 1], [2, 3]])) [0, 1, 2, 3]
See also collapse(), which can flatten multiple levels of nesting.
more_itertools.roundrobin(*iterables)[source]¶
Visit input iterables in a cycle until each is exhausted.
list(roundrobin('ABC', 'D', 'EF')) ['A', 'D', 'E', 'B', 'F', 'C']
This function produces the same output as interleave_longest(), but may perform better for some inputs (in particular when the number of iterables is small).
more_itertools.prepend(value, iterator)[source]¶
Yield value, followed by the elements in iterator.
value = '0' iterator = ['1', '2', '3'] list(prepend(value, iterator)) ['0', '1', '2', '3']
To prepend multiple values, see itertools.chain() or value_chain().
Summarizing¶
These tools return summarized or aggregated data from an iterable.
New itertools
more_itertools.ilen(iterable)[source]¶
Return the number of items in iterable.
For example, there are 168 prime numbers below 1,000:
ilen(sieve(1000)) 168
Equivalent to, but faster than:
def ilen(iterable): count = 0 for _ in iterable: count += 1 return count
This fully consumes the iterable, so handle with care.
more_itertools.unique_to_each(*iterables)[source]¶
Return the elements from each of the input iterables that aren’t in the other input iterables.
For example, suppose you have a set of packages, each with a set of dependencies:
{'pkg_1': {'A', 'B'}, 'pkg_2': {'B', 'C'}, 'pkg_3': {'B', 'D'}}
If you remove one package, which dependencies can also be removed?
If pkg_1 is removed, then A is no longer necessary - it is not associated with pkg_2 or pkg_3. Similarly, C is only needed for pkg_2, and D is only needed for pkg_3:
unique_to_each({'A', 'B'}, {'B', 'C'}, {'B', 'D'}) [['A'], ['C'], ['D']]
If there are duplicates in one input iterable that aren’t in the others they will be duplicated in the output. Input order is preserved:
unique_to_each("mississippi", "missouri") [['p', 'p'], ['o', 'u', 'r']]
It is assumed that the elements of each iterable are hashable.
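One way to reproduce this behavior is to count, for each element, how many of the input iterables contain it, and keep the elements that appear in exactly one. This is a sketch under that assumption; unique_to_each_sketch is a hypothetical name.

```python
from collections import Counter


def unique_to_each_sketch(*iterables):
    """Return, for each input, the items that appear in no other input."""
    pools = [list(it) for it in iterables]
    # Count in how many inputs each element occurs (once per input).
    counts = Counter(element for pool in pools for element in set(pool))
    # Keep duplicates and input order within each pool.
    return [[element for element in pool if counts[element] == 1] for pool in pools]


print(unique_to_each_sketch("mississippi", "missouri"))
# [['p', 'p'], ['o', 'u', 'r']]
```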
more_itertools.sample(iterable, k=1, weights=None)[source]¶
Return a k-length list of elements chosen (without replacement) from the iterable. Similar to random.sample(), but works on iterables of unknown length.
iterable = range(100) sample(iterable, 5) [81, 60, 96, 16, 4]
For iterables with repeated elements, you may supply counts to indicate the repeats.
iterable = ['a', 'b'] counts = [3, 4] # Equivalent to 'a', 'a', 'a', 'b', 'b', 'b', 'b' sample(iterable, k=3, counts=counts) ['a', 'a', 'b']
An iterable with weights may be given:
iterable = range(100) weights = (i * i + 1 for i in range(100)) sampled = sample(iterable, 5, weights=weights) [79, 67, 74, 66, 78]
Weighted selections are made without replacement. After an element is selected, it is removed from the pool and the relative weights of the other elements increase (this does not match the behavior of random.sample()’s counts parameter). Note that weights may not be used with counts.
If the length of iterable is less than k, ValueError is raised if strict is True and all elements are returned (in shuffled order) if strict is False.
By default, the Algorithm L reservoir sampling technique is used. When weights are provided, Algorithm A-ExpJ is used.
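To show the general idea of reservoir sampling, the sketch below uses the simpler Algorithm R rather than Algorithm L or A-ExpJ. It is not the library's implementation; reservoir_sample_sketch and its rng parameter are hypothetical.

```python
import random


def reservoir_sample_sketch(iterable, k, rng=random):
    """Pick k items uniformly at random in one pass (Algorithm R)."""
    reservoir = []
    for index, item in enumerate(iterable):
        if index < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Replace a slot with probability k / (index + 1).
            j = rng.randint(0, index)
            if j < k:
                reservoir[j] = item
    return reservoir


picked = reservoir_sample_sketch(range(100), 5, random.Random(0))
print(len(picked))  # 5
```

Unlike the function documented above, this sketch returns short inputs in their original order rather than shuffled.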
more_itertools.consecutive_groups(iterable, ordering=lambda x: ...)[source]¶
Yield groups of consecutive items using itertools.groupby(). The ordering function determines whether two items are adjacent by returning their position.
By default, the ordering function is the identity function. This is suitable for finding runs of numbers:
iterable = [1, 10, 11, 12, 20, 30, 31, 32, 33, 40] for group in consecutive_groups(iterable): ... print(list(group)) [1] [10, 11, 12] [20] [30, 31, 32, 33] [40]
For finding runs of adjacent letters, try using the index() method of a string of letters:
from string import ascii_lowercase iterable = 'abcdfgilmnop' ordering = ascii_lowercase.index for group in consecutive_groups(iterable, ordering): ... print(list(group)) ['a', 'b', 'c', 'd'] ['f', 'g'] ['i'] ['l', 'm', 'n', 'o', 'p']
Each group of consecutive items is an iterator that shares its source with iterable. When an output group is advanced, the previous group is no longer available unless its elements are copied (e.g., into a list).
iterable = [1, 2, 11, 12, 21, 22] saved_groups = [] for group in consecutive_groups(iterable): ... saved_groups.append(list(group)) # Copy group elements saved_groups [[1, 2], [11, 12], [21, 22]]
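The grouping can be sketched with itertools.groupby() by noting that, within a run of consecutive items, the difference between an item's index and its ordering position stays constant. The name consecutive_groups_sketch is hypothetical.

```python
from itertools import groupby
from operator import itemgetter


def consecutive_groups_sketch(iterable, ordering=lambda x: x):
    """Yield iterators over runs of consecutive items."""
    # Inside a run, index - ordering(item) does not change.
    keyed = groupby(enumerate(iterable), key=lambda pair: pair[0] - ordering(pair[1]))
    for _, group in keyed:
        # Drop the index and yield only the original items.
        yield map(itemgetter(1), group)


runs = [list(g) for g in consecutive_groups_sketch([1, 10, 11, 12, 20, 30, 31])]
print(runs)
# [[1], [10, 11, 12], [20], [30, 31]]
```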
class more_itertools.run_length[source]¶
run_length.encode() compresses an iterable with run-length encoding. It yields groups of repeated items with the count of how many times they were repeated:
uncompressed = 'abbcccdddd' list(run_length.encode(uncompressed)) [('a', 1), ('b', 2), ('c', 3), ('d', 4)]
run_length.decode() decompresses an iterable that was previously compressed with run-length encoding. It yields the items of the decompressed iterable:
compressed = [('a', 1), ('b', 2), ('c', 3), ('d', 4)] list(run_length.decode(compressed)) ['a', 'b', 'b', 'c', 'c', 'c', 'd', 'd', 'd', 'd']
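Both directions can be sketched in a few lines with itertools. The function names rle_encode_sketch and rle_decode_sketch are hypothetical stand-ins for the two methods described above.

```python
from itertools import chain, groupby, repeat


def rle_encode_sketch(iterable):
    """Yield (item, run_length) pairs for each run of equal items."""
    return ((key, sum(1 for _ in group)) for key, group in groupby(iterable))


def rle_decode_sketch(pairs):
    """Expand (item, run_length) pairs back into the original items."""
    return chain.from_iterable(repeat(item, count) for item, count in pairs)


encoded = list(rle_encode_sketch('abbcccdddd'))
print(encoded)
# [('a', 1), ('b', 2), ('c', 3), ('d', 4)]
print(''.join(rle_decode_sketch(encoded)))
# abbcccdddd
```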
more_itertools.map_reduce(iterable, keyfunc, valuefunc=None, reducefunc=None)[source]¶
Return a dictionary that maps the items in iterable to categories defined by keyfunc, transforms them with valuefunc, and then summarizes them by category with reducefunc.
valuefunc defaults to the identity function if it is unspecified. If reducefunc is unspecified, no summarization takes place:
keyfunc = lambda x: x.upper() result = map_reduce('abbccc', keyfunc) sorted(result.items()) [('A', ['a']), ('B', ['b', 'b']), ('C', ['c', 'c', 'c'])]
Specifying valuefunc transforms the categorized items:
keyfunc = lambda x: x.upper() valuefunc = lambda x: 1 result = map_reduce('abbccc', keyfunc, valuefunc) sorted(result.items()) [('A', [1]), ('B', [1, 1]), ('C', [1, 1, 1])]
Specifying reducefunc summarizes the categorized items:
keyfunc = lambda x: x.upper() valuefunc = lambda x: 1 reducefunc = sum result = map_reduce('abbccc', keyfunc, valuefunc, reducefunc) sorted(result.items()) [('A', 1), ('B', 2), ('C', 3)]
You may want to filter the input iterable before applying the map/reduce procedure:
all_items = range(30) items = [x for x in all_items if 10 <= x <= 20] # Filter keyfunc = lambda x: x % 2 # Evens map to 0; odds to 1 categories = map_reduce(items, keyfunc=keyfunc) sorted(categories.items()) [(0, [10, 12, 14, 16, 18, 20]), (1, [11, 13, 15, 17, 19])] summaries = map_reduce(items, keyfunc=keyfunc, reducefunc=sum) sorted(summaries.items()) [(0, 90), (1, 75)]
Note that all items in the iterable are gathered into a list before the summarization step, which may require significant storage.
The returned object is a collections.defaultdict with the default_factory set to None, such that it behaves like a normal dictionary.
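The three stages (categorize, transform, summarize) and the default_factory detail can be sketched as follows. This is an illustration under the description above, with the hypothetical name map_reduce_sketch.

```python
from collections import defaultdict


def map_reduce_sketch(iterable, keyfunc, valuefunc=None, reducefunc=None):
    """Group items by keyfunc, transform with valuefunc, reduce per group."""
    groups = defaultdict(list)
    for item in iterable:
        value = item if valuefunc is None else valuefunc(item)
        groups[keyfunc(item)].append(value)
    if reducefunc is not None:
        # Replace each list of values with its summary.
        for key, values in groups.items():
            groups[key] = reducefunc(values)
    # With no default factory, missing keys raise KeyError as in a dict.
    groups.default_factory = None
    return groups


result = map_reduce_sketch('abbccc', str.upper, lambda x: 1, sum)
print(sorted(result.items()))
# [('A', 1), ('B', 2), ('C', 3)]
```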
more_itertools.join_mappings(**field_to_map)[source]¶
Joins multiple mappings together using their common keys.
user_scores = {'elliot': 50, 'claris': 60} user_times = {'elliot': 30, 'claris': 40} join_mappings(score=user_scores, time=user_times) {'elliot': {'score': 50, 'time': 30}, 'claris': {'score': 60, 'time': 40}}
more_itertools.exactly_n(iterable, n, predicate=bool)[source]¶
Return True if exactly n items in the iterable are True according to the predicate function.
exactly_n([True, True, False], 2) True exactly_n([True, True, False], 1) False exactly_n([0, 1, 2, 3, 4, 5], 3, lambda x: x < 3) True
The iterable will be advanced until n + 1 truthy items are encountered, so avoid calling it on infinite iterables.
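The early stop at n + 1 matching items can be sketched by filtering the iterable and taking at most n + 1 results. The name exactly_n_sketch is hypothetical.

```python
from itertools import islice


def exactly_n_sketch(iterable, n, predicate=bool):
    """Return True if exactly n items satisfy predicate."""
    # Taking n + 1 matches is enough to tell "exactly n" from "more than n".
    matches = list(islice(filter(predicate, iterable), n + 1))
    return len(matches) == n


print(exactly_n_sketch([0, 1, 2, 3, 4, 5], 3, lambda x: x < 3))
# True
```

An infinite iterable that never produces n + 1 matching items would still make this sketch loop forever, which is the case the warning above is about.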
more_itertools.is_sorted(iterable, key=None, reverse=False, strict=False)[source]¶
Returns True if the items of iterable are in sorted order, and False otherwise. key and reverse have the same meaning that they do in the built-in sorted() function.
is_sorted(['1', '2', '3', '4', '5'], key=int) True is_sorted([5, 4, 3, 1, 2], reverse=True) False
If strict, tests for strict sorting, that is, returns False if equal elements are found:
is_sorted([1, 2, 2]) True is_sorted([1, 2, 2], strict=True) False
The function returns False after encountering the first out-of-order item, which means it may produce results that differ from the built-in sorted() function for objects with unusual comparison dynamics (like math.nan). If there are no out-of-order items, the iterable is exhausted.
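The short-circuiting comparison of adjacent items can be sketched with itertools.tee(). This is an illustration only; is_sorted_sketch is a hypothetical name and the library's own code may differ.

```python
from itertools import tee


def is_sorted_sketch(iterable, key=None, reverse=False, strict=False):
    """Compare adjacent items and stop at the first one out of order."""
    items = iterable if key is None else map(key, iterable)
    earlier, later = tee(items)
    next(later, None)  # shift the second copy one item ahead
    if reverse:
        # Swapping the copies flips the direction of every comparison.
        earlier, later = later, earlier
    if strict:
        return all(a < b for a, b in zip(earlier, later))
    return not any(b < a for a, b in zip(earlier, later))


print(is_sorted_sketch([1, 2, 2]), is_sorted_sketch([1, 2, 2], strict=True))
# True False
```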
more_itertools.all_unique(iterable, key=None)[source]¶
Returns True if all the elements of iterable are unique (no two elements are equal).
all_unique('ABCB') False
If a key function is specified, it will be used to make comparisons.
all_unique('ABCb') True all_unique('ABCb', str.lower) False
The function returns as soon as the first non-unique element is encountered. Iterables with a mix of hashable and unhashable items can be used, but the function will be slower for unhashable items.
more_itertools.minmax(iterable, *[, key, default])[source]¶
more_itertools.minmax(arg1, arg2, *args[, key])[source]
Returns both the smallest and largest items in an iterable, or the smallest and largest of two or more arguments.
minmax([3, 1, 5]) (1, 5)
minmax(4, 2, 6) (2, 6)
If a key function is provided, it will be used to transform the input items for comparison.
minmax([5, 30], key=str) # '30' sorts before '5' (30, 5)
If a default value is provided, it will be returned if there are no input items.
minmax([], default=(0, 0)) (0, 0)
Otherwise ValueError is raised.
This function is based on the recipe by Raymond Hettinger and takes care to minimize the number of comparisons performed.
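A common way to reduce comparisons is to process items in pairs: one comparison orders the pair, then the smaller item is checked against the running minimum and the larger against the running maximum, for about three comparisons per two items. The sketch below illustrates that idea for the single-iterable form only; minmax_sketch is a hypothetical name and this is not claimed to be the recipe's exact code.

```python
from itertools import zip_longest


def minmax_sketch(iterable, key=None):
    """Return (smallest, largest) using about 1.5 comparisons per item."""
    it = iter(iterable)
    try:
        lo = hi = next(it)
    except StopIteration:
        raise ValueError('minmax_sketch() arg is an empty iterable') from None
    if key is None:
        key = lambda x: x
    lo_key = hi_key = key(lo)
    # Read two items at a time; pad an odd tail with the first item.
    for x, y in zip_longest(it, it, fillvalue=lo):
        x_key, y_key = key(x), key(y)
        if y_key < x_key:
            x, y, x_key, y_key = y, x, y_key, x_key
        if x_key < lo_key:
            lo, lo_key = x, x_key
        if hi_key < y_key:
            hi, hi_key = y, y_key
    return lo, hi


print(minmax_sketch([3, 1, 5]))
# (1, 5)
```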
more_itertools.iequals(*iterables)[source]¶
Return True if all given iterables are equal to each other, which means that they contain the same elements in the same order.
The function is useful for comparing iterables of different data types or iterables that do not support equality checks.
iequals("abc", ['a', 'b', 'c'], ('a', 'b', 'c'), iter("abc")) True
iequals("abc", "acb") False
Not to be confused with all_equal(), which checks whether all elements of iterable are equal to each other.
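Element-wise comparison across iterables of different types can be sketched with itertools.zip_longest() and a sentinel that marks a length mismatch. The name iequals_sketch is hypothetical.

```python
from itertools import zip_longest


def iequals_sketch(*iterables):
    """Return True if all iterables yield equal items in the same order."""
    missing = object()  # sentinel that cannot equal any real item
    for column in zip_longest(*iterables, fillvalue=missing):
        # A sentinel in the column means one input ended early.
        if any(item is missing for item in column):
            return False
        first = column[0]
        if any(item != first for item in column[1:]):
            return False
    return True


print(iequals_sketch("abc", ['a', 'b', 'c'], iter("abc")))
# True
```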
Itertools recipes
more_itertools.all_equal(iterable, key=None)[source]¶
Returns True if all the elements are equal to each other.
all_equal('aaaa') True all_equal('aaab') False
A function that accepts a single argument and returns a transformed version of each input item can be specified with key:
all_equal('AaaA', key=str.casefold) True all_equal([1, 2, 3], key=lambda x: x < 10) True
more_itertools.first_true(iterable, default=None, pred=None)[source]¶
Returns the first true value in the iterable.
If no true value is found, returns default.
If pred is not None, returns the first item for which pred(item) == True.
first_true(range(10)) 1 first_true(range(10), pred=lambda x: x > 5) 6 first_true(range(10), default='missing', pred=lambda x: x > 9) 'missing'
more_itertools.quantify(iterable, pred=bool)[source]¶
Return how many times the predicate is true.
quantify([True, False, True]) 2
Selecting¶
These tools yield certain items from an iterable.
New itertools
class more_itertools.islice_extended(iterable, stop)[source]¶
class more_itertools.islice_extended(iterable, start, stop[, step])[source]
An extension of itertools.islice() that supports negative values for stop, start, and step.
iterable = iter('abcdefgh') list(islice_extended(iterable, -4, -1)) ['e', 'f', 'g']
Slices with negative values require some caching of iterable, but this function takes care to minimize the amount of memory required.
For example, you can use a negative step with an infinite iterator:
from itertools import count list(islice_extended(count(), 110, 99, -2)) [110, 108, 106, 104, 102, 100]
You can also use slice notation directly:
iterable = map(str, count()) it = islice_extended(iterable)[10:20:2] list(it) ['10', '12', '14', '16', '18']
more_itertools.first(iterable[, default])[source]¶
Return the first item of iterable, or default if iterable is empty.
first([0, 1, 2, 3]) 0 first([], 'some default') 'some default'
If default is not provided and there are no items in the iterable, raise ValueError.
first() is useful when you have a generator of expensive-to-retrieve values and want any arbitrary one. It is marginally shorter than next(iter(iterable), default).
more_itertools.last(iterable[, default])[source]¶
Return the last item of iterable, or default if iterable is empty.
last([0, 1, 2, 3]) 3 last([], 'some default') 'some default'
If default is not provided and there are no items in the iterable, raise ValueError.
more_itertools.one(iterable, too_short=ValueError, too_long=ValueError)[source]¶
Return the first item from iterable, which is expected to contain only that item. Raise an exception if iterable is empty or has more than one item.
one() is useful for ensuring that an iterable contains only one item. For example, it can be used to retrieve the result of a database query that is expected to return a single row.
If iterable is empty, ValueError will be raised. You may specify a different exception with the too_short keyword:
it = [] one(it) Traceback (most recent call last): ... ValueError: too few items in iterable (expected 1) too_short = IndexError('too few items') one(it, too_short=too_short) Traceback (most recent call last): ... IndexError: too few items
Similarly, if iterable contains more than one item, ValueError will be raised. You may specify a different exception with the too_long keyword:
it = ['too', 'many'] one(it) Traceback (most recent call last): ... ValueError: Expected exactly one item in iterable, but got 'too', 'many', and perhaps more. too_long = RuntimeError one(it, too_long=too_long) Traceback (most recent call last): ... RuntimeError
Note that one() attempts to advance iterable twice to ensure there is only one item. See spy() or peekable() to check iterable contents less destructively.
more_itertools.only(iterable, default=None, too_long=ValueError)[source]¶
If iterable has only one item, return it. If it has zero items, return default. If it has more than one item, raise the exception given by too_long, which is ValueError by default.
only([], default='missing') 'missing' only([1]) 1 only([1, 2]) Traceback (most recent call last): ... ValueError: Expected exactly one item in iterable, but got 1, 2, and perhaps more. only([1, 2], too_long=TypeError) Traceback (most recent call last): ... TypeError
Note that only() attempts to advance iterable twice to ensure there is only one item. See spy() or peekable() to check iterable contents less destructively.
more_itertools.strictly_n(iterable, n, too_short=None, too_long=None)[source]¶
Validate that iterable has exactly n items and return them if it does. If it has fewer than n items, call function too_short with those items. If it has more than n items, call function too_long with the first n + 1 items.
iterable = ['a', 'b', 'c', 'd'] n = 4 list(strictly_n(iterable, n)) ['a', 'b', 'c', 'd']
Note that the returned iterable must be consumed in order for the check to be made.
By default, too_short and too_long are functions that raise ValueError.
list(strictly_n('ab', 3)) Traceback (most recent call last): ... ValueError: too few items in iterable (got 2)
list(strictly_n('abc', 2)) Traceback (most recent call last): ... ValueError: too many items in iterable (got at least 3)
You can instead supply functions that do something else. too_short will be called with the number of items in iterable. too_long will be called with n + 1.
def too_short(item_count): ... raise RuntimeError it = strictly_n('abcd', 6, too_short=too_short) list(it) Traceback (most recent call last): ... RuntimeError
def too_long(item_count): ... print('The boss is going to hear about this') it = strictly_n('abcdef', 4, too_long=too_long) list(it) The boss is going to hear about this ['a', 'b', 'c', 'd']
more_itertools.strip(iterable, pred)[source]¶
Yield the items from iterable, but strip any from the beginning and end for which pred returns True.
For example, to remove a set of items from both ends of an iterable:
iterable = (None, False, None, 1, 2, None, 3, False, None) pred = lambda x: x in {None, False, ''} list(strip(iterable, pred)) [1, 2, None, 3]
This function is analogous to str.strip().
more_itertools.lstrip(iterable, pred)[source]¶
Yield the items from iterable, but strip any from the beginning for which pred returns True.
For example, to remove a set of items from the start of an iterable:
iterable = (None, False, None, 1, 2, None, 3, False, None) pred = lambda x: x in {None, False, ''} list(lstrip(iterable, pred)) [1, 2, None, 3, False, None]
This function is analogous to str.lstrip(), and is essentially a wrapper for itertools.dropwhile().
more_itertools.rstrip(iterable, pred)[source]¶
Yield the items from iterable, but strip any from the end for which pred returns True.
For example, to remove a set of items from the end of an iterable:
iterable = (None, False, None, 1, 2, None, 3, False, None) pred = lambda x: x in {None, False, ''} list(rstrip(iterable, pred)) [None, False, None, 1, 2, None, 3]
This function is analogous to str.rstrip().
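Stripping from the end of an iterable of unknown length requires holding back matching items until a non-matching item proves they are not trailing. The sketch below shows that buffering, and builds a two-sided strip on top of it with itertools.dropwhile(). The names rstrip_sketch and strip_sketch are hypothetical.

```python
from itertools import dropwhile


def rstrip_sketch(iterable, pred):
    """Yield items, dropping trailing ones for which pred is true."""
    held = []
    for item in iterable:
        if pred(item):
            # Might be trailing: hold it back for now.
            held.append(item)
        else:
            # Held items were interior after all, so release them.
            yield from held
            held.clear()
            yield item


def strip_sketch(iterable, pred):
    """Drop matching items from both ends."""
    return rstrip_sketch(dropwhile(pred, iterable), pred)


data = (None, False, None, 1, 2, None, 3, False, None)
is_blank = lambda x: x in {None, False, ''}
print(list(strip_sketch(data, is_blank)))
# [1, 2, None, 3]
```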
more_itertools.filter_except(validator, iterable, *exceptions)[source]¶
Yield the items from iterable for which the validator function does not raise one of the specified exceptions.
validator is called for each item in iterable. It should be a function that accepts one argument and raises an exception if that item is not valid.
iterable = ['1', '2', 'three', '4', None] list(filter_except(int, iterable, ValueError, TypeError)) ['1', '2', '4']
If an exception other than one given by exceptions is raised by validator, it is raised like normal.
more_itertools.map_except(function, iterable, *exceptions)[source]¶
Transform each item from iterable with function and yield the result, unless function raises one of the specified exceptions.
function is called to transform each item in iterable. It should accept one argument.
iterable = ['1', '2', 'three', '4', None] list(map_except(int, iterable, ValueError, TypeError)) [1, 2, 4]
If an exception other than one given by exceptions is raised by function, it is raised like normal.
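The difference between filtering and mapping with exception handling can be sketched as two small generators. The names filter_except_sketch and map_except_sketch are hypothetical.

```python
def filter_except_sketch(validator, iterable, *exceptions):
    """Yield the original items that validator accepts."""
    for item in iterable:
        try:
            validator(item)
        except exceptions:
            pass  # listed exception: skip this item
        else:
            yield item


def map_except_sketch(function, iterable, *exceptions):
    """Yield function(item), skipping items that raise a listed exception."""
    for item in iterable:
        try:
            yield function(item)
        except exceptions:
            pass  # any other exception type propagates to the caller


values = ['1', '2', 'three', '4', None]
print(list(filter_except_sketch(int, values, ValueError, TypeError)))
# ['1', '2', '4']
print(list(map_except_sketch(int, values, ValueError, TypeError)))
# [1, 2, 4]
```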
more_itertools.filter_map(func, iterable)[source]¶
Apply func to every element of iterable, yielding only those which are not None.
elems = ['1', 'a', '2', 'b', '3'] list(filter_map(lambda s: int(s) if s.isnumeric() else None, elems)) [1, 2, 3]
more_itertools.iter_suppress(iterable, *exceptions)[source]¶
Yield each of the items from iterable. If the iteration raises one of the specified exceptions, that exception will be suppressed and iteration will stop.
from itertools import chain def breaks_at_five(x): ... while True: ... if x >= 5: ... raise RuntimeError ... yield x ... x += 1 it_1 = iter_suppress(breaks_at_five(1), RuntimeError) it_2 = iter_suppress(breaks_at_five(2), RuntimeError) list(chain(it_1, it_2)) [1, 2, 3, 4, 2, 3, 4]
more_itertools.nth_or_last(iterable, n[, default])[source]¶
Return the nth or the last item of iterable, or default if iterable is empty.
nth_or_last([0, 1, 2, 3], 2) 2 nth_or_last([0, 1], 2) 1 nth_or_last([], 0, 'some default') 'some default'
If default is not provided and there are no items in the iterable, raise ValueError.
more_itertools.unique_in_window(iterable, n, key=None)[source]¶
Yield the items from iterable that haven’t been seen recently. n is the size of the lookback window.
iterable = [0, 1, 0, 2, 3, 0] n = 3 list(unique_in_window(iterable, n)) [0, 1, 2, 3, 0]
The key function, if provided, will be used to determine uniqueness:
list(unique_in_window('abAcda', 3, key=lambda x: x.lower())) ['a', 'b', 'c', 'd', 'a']
The items in iterable must be hashable.
more_itertools.duplicates_everseen(iterable, key=None)[source]¶
Yield duplicate elements after their first appearance.
list(duplicates_everseen('mississippi')) ['s', 'i', 's', 's', 'i', 'p', 'i'] list(duplicates_everseen('AaaBbbCccAaa', str.lower)) ['a', 'a', 'b', 'b', 'c', 'c', 'A', 'a', 'a']
This function is analogous to unique_everseen() and is subject to the same performance considerations.
more_itertools.duplicates_justseen(iterable, key=None)[source]¶
Yields serially-duplicate elements after their first appearance.
list(duplicates_justseen('mississippi')) ['s', 's', 'p'] list(duplicates_justseen('AaaBbbCccAaa', str.lower)) ['a', 'a', 'b', 'b', 'c', 'c', 'a', 'a']
This function is analogous to unique_justseen().
more_itertools.classify_unique(iterable, key=None)[source]¶
Classify each element in terms of its uniqueness.
For each element in the input iterable, return a 3-tuple consisting of:
- The element itself
- False if the element is equal to the one preceding it in the input, True otherwise (i.e. the equivalent of unique_justseen())
- False if this element has been seen anywhere in the input before, True otherwise (i.e. the equivalent of unique_everseen())
list(classify_unique('otto')) [('o', True, True), ('t', True, True), ('t', False, False), ('o', True, False)]
This function is analogous to unique_everseen() and is subject to the same performance considerations.
more_itertools.longest_common_prefix(iterables)[source]¶
Yield elements of the longest common prefix amongst given iterables.
''.join(longest_common_prefix(['abcd', 'abc', 'abf'])) 'ab'
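The prefix can be sketched by walking the iterables in lockstep with zip() and stopping at the first position where they disagree. The name longest_common_prefix_sketch is hypothetical.

```python
def longest_common_prefix_sketch(iterables):
    """Yield leading items shared by every iterable."""
    for column in zip(*iterables):
        first = column[0]
        if any(item != first for item in column[1:]):
            return  # first mismatch ends the common prefix
        yield first


print(''.join(longest_common_prefix_sketch(['abcd', 'abc', 'abf'])))
# ab
```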
more_itertools.takewhile_inclusive(predicate, iterable)[source]¶
A variant of takewhile() that yields one additional element.
list(takewhile_inclusive(lambda x: x < 5, [1, 4, 6, 4, 1])) [1, 4, 6]
takewhile() would return [1, 4].
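The difference from takewhile() comes down to yielding the item before testing it. A minimal sketch, with the hypothetical name takewhile_inclusive_sketch:

```python
from itertools import takewhile


def takewhile_inclusive_sketch(predicate, iterable):
    """Yield items up to and including the first one that fails predicate."""
    for item in iterable:
        yield item
        if not predicate(item):
            break


data = [1, 4, 6, 4, 1]
print(list(takewhile_inclusive_sketch(lambda x: x < 5, data)))
# [1, 4, 6]
print(list(takewhile(lambda x: x < 5, data)))
# [1, 4]
```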
Itertools recipes
more_itertools.nth(iterable, n, default=None)[source]¶
Returns the nth item or a default value.
l = range(10) nth(l, 3) 3 nth(l, 20, "zebra") 'zebra'
more_itertools.before_and_after(predicate, it)[source]¶
A variant of takewhile() that allows complete access to the remainder of the iterator.
it = iter('ABCdEfGhI') all_upper, remainder = before_and_after(str.isupper, it) ''.join(all_upper) 'ABC' ''.join(remainder) # takewhile() would lose the 'd' 'dEfGhI'
Note that the first iterator must be fully consumed before the second iterator can generate valid results.
more_itertools.take(n, iterable)[source]¶
Return first n items of the iterable as a list.
take(3, range(10)) [0, 1, 2]
If there are fewer than n items in the iterable, all of them are returned.
take(10, range(3)) [0, 1, 2]
more_itertools.tail(n, iterable)[source]¶
Return an iterator over the last n items of iterable.
t = tail(3, 'ABCDEFG') list(t) ['E', 'F', 'G']
more_itertools.unique_everseen(iterable, key=None)[source]¶
Yield unique elements, preserving order.
list(unique_everseen('AAAABBBCCDAABBB')) ['A', 'B', 'C', 'D'] list(unique_everseen('ABBCcAD', str.lower)) ['A', 'B', 'C', 'D']
Sequences with a mix of hashable and unhashable items can be used. The function will be slower (i.e., O(n^2)) for unhashable items.
Remember that list objects are unhashable - you can use the key parameter to transform the list to a tuple (which is hashable) to avoid a slowdown.
iterable = ([1, 2], [2, 3], [1, 2]) list(unique_everseen(iterable)) # Slow [[1, 2], [2, 3]] list(unique_everseen(iterable, key=tuple)) # Faster [[1, 2], [2, 3]]
Similarly, you may want to convert unhashable set objects with key=frozenset. For dict objects, key=lambda x: frozenset(x.items()) can be used.
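The performance difference between hashable and unhashable items can be seen in a sketch that keeps a set for hashable keys and falls back to a list, with its linear membership test, for everything else. The name unique_everseen_sketch is hypothetical.

```python
def unique_everseen_sketch(iterable, key=None):
    """Yield items whose key has not been seen before, in input order."""
    seen_hashable = set()   # average O(1) membership test
    seen_unhashable = []    # O(n) membership test
    for element in iterable:
        k = element if key is None else key(element)
        try:
            if k not in seen_hashable:
                seen_hashable.add(k)
                yield element
        except TypeError:
            # k is unhashable, so use the slower list lookup.
            if k not in seen_unhashable:
                seen_unhashable.append(k)
                yield element


rows = ([1, 2], [2, 3], [1, 2])
print(list(unique_everseen_sketch(rows)))             # slow path
# [[1, 2], [2, 3]]
print(list(unique_everseen_sketch(rows, key=tuple)))  # fast path
# [[1, 2], [2, 3]]
```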
more_itertools.unique_justseen(iterable, key=None)[source]¶
Yields elements in order, ignoring serial duplicates.
list(unique_justseen('AAAABBBCCDAABBB')) ['A', 'B', 'C', 'D', 'A', 'B'] list(unique_justseen('ABBCcAD', str.lower)) ['A', 'B', 'C', 'A', 'D']
more_itertools.unique(iterable, key=None, reverse=False)[source]¶
Yields unique elements in sorted order.
list(unique([[1, 2], [3, 4], [1, 2]])) [[1, 2], [3, 4]]
key and reverse are passed to sorted().
list(unique('ABBcCAD', str.casefold)) ['A', 'B', 'c', 'D'] list(unique('ABBcCAD', str.casefold, reverse=True)) ['D', 'c', 'B', 'A']
The elements in iterable need not be hashable, but they must be comparable for sorting to work.
Combinatorics¶
These tools yield combinatorial arrangements of items from iterables.
New itertools
more_itertools.distinct_permutations(iterable, r=None)[source]¶
Yield successive distinct permutations of the elements in iterable.
sorted(distinct_permutations([1, 0, 1])) [(0, 1, 1), (1, 0, 1), (1, 1, 0)]
Equivalent to yielding from set(permutations(iterable)), except duplicates are not generated and thrown away. For larger input sequences this is much more efficient.
Duplicate permutations arise when there are duplicated elements in the input iterable. The number of items returned is n! / (x_1! * x_2! * … * x_k!), where n is the total number of items input, k is the number of distinct items, and each x_i is the count of a distinct item in the input sequence. The function multinomial() computes this directly.
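As a worked check of the counting formula, the sketch below computes it with math.factorial() and compares it against a brute-force count for a small input. The name distinct_permutation_count is hypothetical.

```python
from collections import Counter
from itertools import permutations
from math import factorial


def distinct_permutation_count(items):
    """Return n! divided by the factorial of each distinct item's count."""
    items = list(items)
    total = factorial(len(items))
    for count in Counter(items).values():
        total //= factorial(count)
    return total


print(distinct_permutation_count([1, 0, 1]))
# 3
print(len(set(permutations([1, 0, 1]))))
# 3
print(distinct_permutation_count('mississippi'))
# 34650
```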
If r is given, only the r-length permutations are yielded.
sorted(distinct_permutations([1, 0, 1], r=2)) [(0, 1), (1, 0), (1, 1)] sorted(distinct_permutations(range(3), r=2)) [(0, 1), (0, 2), (1, 0), (1, 2), (2, 0), (2, 1)]
iterable need not be sortable, but note that using equal (x == y) but non-identical (id(x) != id(y)) elements may produce surprising behavior. For example, 1 and True are equal but non-identical:
list(distinct_permutations([1, True, '3'])) [ (1, True, '3'), (1, '3', True), ('3', 1, True) ] list(distinct_permutations([1, 2, '3'])) [ (1, 2, '3'), (1, '3', 2), (2, 1, '3'), (2, '3', 1), ('3', 1, 2), ('3', 2, 1) ]
more_itertools.distinct_combinations(iterable, r)[source]¶
Yield the distinct combinations of r items taken from iterable.
list(distinct_combinations([0, 0, 1], 2)) [(0, 0), (0, 1)]
Equivalent to set(combinations(iterable, r)), except duplicates are not generated and thrown away. For larger input sequences this is much more efficient.
more_itertools.nth_combination_with_replacement(iterable, r, index)[source]¶
Equivalent to list(combinations_with_replacement(iterable, r))[index].
The subsequences with repetition of iterable that are of length r can be ordered lexicographically. nth_combination_with_replacement() computes the subsequence at sort position index directly, without computing the previous subsequences with replacement.
nth_combination_with_replacement(range(5), 3, 5) (0, 1, 1)
ValueError will be raised if r is negative or greater than the length of iterable. IndexError will be raised if the given index is invalid.
more_itertools.circular_shifts(iterable, steps=1)[source]¶
Yield the circular shifts of iterable.
list(circular_shifts(range(4))) [(0, 1, 2, 3), (1, 2, 3, 0), (2, 3, 0, 1), (3, 0, 1, 2)]
Set steps to the number of places to rotate to the left (or to the right if negative). Defaults to 1.
list(circular_shifts(range(4), 2)) [(0, 1, 2, 3), (2, 3, 0, 1)]
list(circular_shifts(range(4), -1)) [(0, 1, 2, 3), (3, 0, 1, 2), (2, 3, 0, 1), (1, 2, 3, 0)]
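The examples above are consistent with rotating a collections.deque and stopping once the rotation returns to the starting arrangement. The following sketch works on that assumption; circular_shifts_sketch is a hypothetical name.

```python
from collections import deque
from math import gcd


def circular_shifts_sketch(iterable, steps=1):
    """Yield each distinct rotation of the items, steps places at a time."""
    if steps == 0:
        raise ValueError('steps should be a non-zero integer')
    buffer = deque(iterable)
    # Rotating by steps revisits the start after n / gcd(n, steps) shifts.
    shifts = len(buffer) // gcd(len(buffer), steps)
    for _ in range(shifts):
        yield tuple(buffer)
        buffer.rotate(-steps)  # a negative rotation moves items left


print(list(circular_shifts_sketch(range(4), 2)))
# [(0, 1, 2, 3), (2, 3, 0, 1)]
```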
more_itertools.partitions(iterable)[source]¶
Yield all possible order-preserving partitions of iterable.
iterable = 'abc' for part in partitions(iterable): ... print([''.join(p) for p in part]) ['abc'] ['a', 'bc'] ['ab', 'c'] ['a', 'b', 'c']
This is unrelated to partition().
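An order-preserving partition of a sequence of length n corresponds to a choice of cut positions among the n - 1 gaps between items. The sketch below enumerates those choices with itertools.combinations(); partitions_sketch is a hypothetical name.

```python
from itertools import combinations


def partitions_sketch(iterable):
    """Yield every way to cut the sequence into contiguous pieces."""
    sequence = list(iterable)
    n = len(sequence)
    split_points = range(1, n)  # the gaps where a cut may be placed
    for r in range(len(split_points) + 1):
        for cuts in combinations(split_points, r):
            bounds = (0, *cuts, n)
            yield [sequence[a:b] for a, b in zip(bounds, bounds[1:])]


for part in partitions_sketch('abc'):
    print([''.join(piece) for piece in part])
# ['abc']
# ['a', 'bc']
# ['ab', 'c']
# ['a', 'b', 'c']
```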
more_itertools.set_partitions(iterable, k=None, min_size=None, max_size=None)[source]¶
Yield the set partitions of iterable into k parts. Set partitions are not order-preserving.
iterable = 'abc' for part in set_partitions(iterable, 2): ... print([''.join(p) for p in part]) ['a', 'bc'] ['ab', 'c'] ['b', 'ac']
If k is not given, every set partition is generated.
iterable = 'abc' for part in set_partitions(iterable): ... print([''.join(p) for p in part]) ['abc'] ['a', 'bc'] ['ab', 'c'] ['b', 'ac'] ['a', 'b', 'c']
If min_size and/or max_size are given, the minimum and/or maximum size per block in the partition is set.
iterable = 'abc' for part in set_partitions(iterable, min_size=2): ... print([''.join(p) for p in part]) ['abc'] for part in set_partitions(iterable, max_size=2): ... print([''.join(p) for p in part]) ['a', 'bc'] ['ab', 'c'] ['b', 'ac'] ['a', 'b', 'c']
more_itertools.product_index(element, *args)[source]¶
Equivalent to list(product(*args)).index(element).
The products of args can be ordered lexicographically. product_index() computes the first index of element without computing the previous products.
product_index([8, 2], range(10), range(5)) 42
ValueError
will be raised if the given element isn’t in the product of args.
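Because itertools.product() varies the last position fastest, the index of an element can be read as a mixed-radix number whose digits are the positions of each item in its pool. The sketch below applies that idea; product_index_sketch is a hypothetical name.

```python
from itertools import product


def product_index_sketch(element, *args):
    """Return the position of element within product(*args)."""
    element = tuple(element)
    if len(element) != len(args):
        raise ValueError('element is not a product of args')
    index = 0
    for item, pool in zip(element, args):
        pool = tuple(pool)
        # tuple.index() raises ValueError if item is not in this pool.
        index = index * len(pool) + pool.index(item)
    return index


print(product_index_sketch([8, 2], range(10), range(5)))
# 42
print(list(product(range(10), range(5))).index((8, 2)))
# 42
```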
more_itertools.combination_index(element, iterable)[source]¶
Equivalent to list(combinations(iterable, r)).index(element)
The subsequences of iterable that are of length r can be ordered lexicographically. combination_index() computes the index of the first element, without computing the previous combinations.
combination_index('adf', 'abcdefg') 10
ValueError
will be raised if the given element isn’t one of the combinations of iterable.
more_itertools.permutation_index(element, iterable)[source]¶
Equivalent to list(permutations(iterable, r)).index(element).
The subsequences of iterable that are of length r where order is important can be ordered lexicographically. permutation_index() computes the index of the first element directly, without computing the previous permutations.
permutation_index([1, 3, 2], range(5)) 19
ValueError
will be raised if the given element isn’t one of the permutations of iterable.
more_itertools.combination_with_replacement_index(element, iterable)[source]¶
Equivalent to list(combinations_with_replacement(iterable, r)).index(element).
The subsequences with repetition of iterable that are of length r can be ordered lexicographically. combination_with_replacement_index() computes the index of the first element, without computing the previous combinations with replacement.
combination_with_replacement_index('adf', 'abcdefg') 20
ValueError
will be raised if the given element isn’t one of the combinations with replacement of iterable.
more_itertools.gray_product(*iterables)[source]¶
Like itertools.product(), but return tuples in an order such that only one element in the generated tuple changes from one iteration to the next.
list(gray_product('AB','CD')) [('A', 'C'), ('B', 'C'), ('B', 'D'), ('A', 'D')]
This function consumes all of the input iterables before producing output. If any of the input iterables have fewer than two items, ValueError is raised.
For information on the algorithm, see this section of Donald Knuth’s The Art of Computer Programming.
more_itertools.outer_product(func, xs, ys, *args, **kwargs)[source]¶
A generalized outer product that applies a binary function to all pairs of items. Returns a 2D matrix with len(xs) rows and len(ys) columns. Also accepts *args and **kwargs that are passed to func.
Multiplication table:
list(outer_product(mul, range(1, 4), range(1, 6))) [(1, 2, 3, 4, 5), (2, 4, 6, 8, 10), (3, 6, 9, 12, 15)]
Cross tabulation:
xs = ['A', 'B', 'A', 'A', 'B', 'B', 'A', 'A', 'B', 'B'] ys = ['X', 'X', 'X', 'Y', 'Z', 'Z', 'Y', 'Y', 'Z', 'Z'] pair_counts = Counter(zip(xs, ys)) count_rows = lambda x, y: pair_counts[x, y] list(outer_product(count_rows, sorted(set(xs)), sorted(set(ys)))) [(2, 3, 0), (1, 0, 4)]
Usage with *args and **kwargs:
animals = ['cat', 'wolf', 'mouse'] list(outer_product(min, animals, animals, key=len)) [('cat', 'cat', 'cat'), ('cat', 'wolf', 'wolf'), ('cat', 'wolf', 'mouse')]
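The row-by-row structure of the result can be sketched as a generator of tuples, one per item of xs. The name outer_product_sketch is hypothetical, and the library's own implementation may be organized differently.

```python
from operator import mul


def outer_product_sketch(func, xs, ys, *args, **kwargs):
    """Yield one row per x, holding func(x, y, ...) for every y."""
    ys = tuple(ys)  # ys is reused for every row, so store it once
    for x in xs:
        yield tuple(func(x, y, *args, **kwargs) for y in ys)


print(list(outer_product_sketch(mul, range(1, 4), range(1, 6))))
# [(1, 2, 3, 4, 5), (2, 4, 6, 8, 10), (3, 6, 9, 12, 15)]
```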
more_itertools.powerset_of_sets(iterable)[source]¶
Yields all possible subsets of the iterable.
list(powerset_of_sets([1, 2, 3])) [set(), {1}, {2}, {3}, {1, 2}, {1, 3}, {2, 3}, {1, 2, 3}] list(powerset_of_sets([1, 1, 0])) [set(), {1}, {0}, {0, 1}]
powerset_of_sets() takes care to minimize the number of hash operations performed.
Itertools recipes
more_itertools.powerset(iterable)[source]¶
Yields all possible subsets of the iterable.
list(powerset([1, 2, 3])) [(), (1,), (2,), (3,), (1, 2), (1, 3), (2, 3), (1, 2, 3)]
powerset() will operate on iterables that aren’t set instances, so repeated elements in the input will produce repeated elements in the output.
seq = [1, 1, 0] list(powerset(seq)) [(), (1,), (1,), (0,), (1, 1), (1, 0), (1, 0), (1, 1, 0)]
For a variant that efficiently yields actual set instances, see powerset_of_sets().
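The difference between the two functions can be sketched as follows. powerset_sketch follows the usual recipe built on itertools.combinations. powerset_of_sets_sketch is a simplified stand-in that removes duplicates first, and it makes no attempt to minimize hash operations. Both names are hypothetical.

```python
from itertools import chain, combinations


def powerset_sketch(iterable):
    """Yield every subset as a tuple, shortest first."""
    s = list(iterable)
    return chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))


def powerset_of_sets_sketch(iterable):
    """Yield every subset as a set, after dropping repeated elements."""
    unique = list(dict.fromkeys(iterable))  # keeps first-seen order
    return (set(combo) for combo in powerset_sketch(unique))


print(list(powerset_sketch([1, 1, 0])))
# [(), (1,), (1,), (0,), (1, 1), (1, 0), (1, 0), (1, 1, 0)]
print(list(powerset_of_sets_sketch([1, 1, 0])))
# [set(), {1}, {0}, {0, 1}]
```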
more_itertools.random_product(*args, repeat=1)[source]¶
Draw an item at random from each of the input iterables.
random_product('abc', range(4), 'XYZ') ('c', 3, 'Z')
If repeat is provided as a keyword argument, that many items will be drawn from each iterable.
random_product('abcd', range(4), repeat=2) ('a', 2, 'd', 3)
This is equivalent to taking a random selection from itertools.product(*args, repeat=repeat).
more_itertools.random_permutation(iterable, r=None)[source]¶
Return a random r length permutation of the elements in iterable.
If r is not specified or is None, then r defaults to the length of iterable.
random_permutation(range(5)) (3, 4, 0, 1, 2)
This is equivalent to taking a random selection from itertools.permutations(iterable, r).
more_itertools.random_combination(iterable, r)[source]¶
Return a random r length subsequence of the elements in iterable.
random_combination(range(5), 3) (2, 3, 4)
This is equivalent to taking a random selection from itertools.combinations(iterable, r).
more_itertools.random_combination_with_replacement(iterable, r)[source]¶
Return a random r length subsequence of elements in iterable, allowing individual elements to be repeated.
random_combination_with_replacement(range(3), 5) (0, 0, 1, 2, 2)
This is equivalent to taking a random selection from itertools.combinations_with_replacement(iterable, r).
more_itertools.nth_product(index, *args)[source]¶
Equivalent to list(product(*args))[index].
The products of args can be ordered lexicographically. nth_product() computes the product at sort position index without computing the previous products.
nth_product(8, range(2), range(2), range(2), range(2)) (1, 0, 0, 0)
IndexError will be raised if the given index is invalid.
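One way to reach a product directly is to treat the index as a mixed-radix number, with one digit per input and the rightmost input varying fastest. The sketch below assumes that approach. The name nth_product_sketch is hypothetical, and this is not claimed to be the library's code.

```python
from itertools import product
from math import prod


def nth_product_sketch(index, *iterables):
    """Return the product at position index without enumerating earlier ones."""
    pools = [tuple(iterable) for iterable in iterables]
    total = prod(len(pool) for pool in pools)
    if index < 0:
        index += total  # allow negative indexes, as list indexing does
    if not 0 <= index < total:
        raise IndexError('index out of range')
    picked = []
    # Peel off one digit per pool, starting from the fastest-varying pool.
    for pool in reversed(pools):
        index, digit = divmod(index, len(pool))
        picked.append(pool[digit])
    return tuple(reversed(picked))


print(nth_product_sketch(8, range(2), range(2), range(2), range(2)))  # (1, 0, 0, 0)
```

The cost is proportional to the number of inputs, not to the value of index.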
more_itertools.nth_permutation(iterable, r, index)[source]¶
Equivalent to list(permutations(iterable, r))[index].
The subsequences of iterable that are of length r where order is important can be ordered lexicographically. nth_permutation() computes the subsequence at sort position index directly, without computing the previous subsequences.
nth_permutation('ghijk', 2, 5) ('h', 'i')
ValueError will be raised if r is negative or greater than the length of iterable. IndexError will be raised if the given index is invalid.
more_itertools.nth_combination(iterable, r, index)[source]¶
Equivalent to list(combinations(iterable, r))[index].
The subsequences of iterable that are of length r can be ordered lexicographically. nth_combination() computes the subsequence at sort position index directly, without computing the previous subsequences.
nth_combination(range(5), 3, 5) (0, 3, 4)
ValueError will be raised if r is negative or greater than the length of iterable. IndexError will be raised if the given index is invalid.
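The direct computation can be sketched by counting how many combinations begin with each candidate element and skipping whole groups at once. The version below is adapted from the recipe in the standard itertools documentation. The name nth_combination_sketch is hypothetical, and its error handling is simplified: an r larger than the input surfaces as IndexError here.

```python
from itertools import combinations
from math import comb


def nth_combination_sketch(iterable, r, index):
    """Return the combination at position index in lexicographic order."""
    pool = tuple(iterable)
    n = len(pool)
    c = comb(n, r)  # total number of combinations
    if index < 0:
        index += c
    if not 0 <= index < c:
        raise IndexError('index out of range')
    result = []
    while r:
        # c becomes the count of combinations that start with the current item.
        c, n, r = c * r // n, n - 1, r - 1
        while index >= c:
            # Skip every combination that starts with the current item.
            index -= c
            c, n = c * (n - r) // n, n - 1
        result.append(pool[-1 - n])
    return tuple(result)


print(nth_combination_sketch(range(5), 3, 5))  # (0, 3, 4)
```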
Wrapping¶
These tools provide wrappers to smooth working with objects that produce or consume iterables.
New itertools
more_itertools.always_iterable(obj, base_type=(<class 'str'>, <class 'bytes'>))[source]¶
If obj is iterable, return an iterator over its items:
obj = (1, 2, 3) list(always_iterable(obj)) [1, 2, 3]
If obj is not iterable, return a one-item iterable containing obj:
obj = 1 list(always_iterable(obj)) [1]
If obj is None, return an empty iterable:
obj = None list(always_iterable(None)) []
By default, binary and text strings are not considered iterable:
obj = 'foo' list(always_iterable(obj)) ['foo']
If base_type is set, objects for which isinstance(obj, base_type) returns True won’t be considered iterable.
obj = {'a': 1} list(always_iterable(obj)) # Iterate over the dict's keys ['a'] list(always_iterable(obj, base_type=dict)) # Treat dicts as a unit [{'a': 1}]
Set base_type to None to avoid any special handling and treat objects Python considers iterable as iterable:
obj = 'foo' list(always_iterable(obj, base_type=None)) ['f', 'o', 'o']
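The rules above can be summarized in a short standard-library sketch. The name always_iterable_sketch is hypothetical, and the code mirrors the documented behavior without claiming to be the library's source.

```python
def always_iterable_sketch(obj, base_type=(str, bytes)):
    """Return an iterator over obj, wrapping non-iterables in a one-item tuple."""
    if obj is None:
        return iter(())
    # Objects matching base_type are treated as single items, not containers.
    if base_type is not None and isinstance(obj, base_type):
        return iter((obj,))
    try:
        return iter(obj)
    except TypeError:
        return iter((obj,))


print(list(always_iterable_sketch('foo')))                  # ['foo']
print(list(always_iterable_sketch('foo', base_type=None)))  # ['f', 'o', 'o']
```

A typical use is accepting either a single value or a collection as a function argument without writing type checks at each call site.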
more_itertools.always_reversible(iterable)[source]¶
An extension of reversed() that supports all iterables, not just those which implement the Reversible or Sequence protocols.
print(*always_reversible(x for x in range(3))) 2 1 0
If the iterable is already reversible, this function returns the result of reversed(). If the iterable is not reversible, this function will cache the remaining items in the iterable and yield them in reverse order, which may require significant storage.
more_itertools.countable(iterable)[source]¶
Wrap iterable and keep a count of how many items have been consumed.
The items_seen attribute starts at 0 and increments as the iterable is consumed:
iterable = map(str, range(10)) it = countable(iterable) it.items_seen 0 next(it), next(it) ('0', '1') list(it) ['2', '3', '4', '5', '6', '7', '8', '9'] it.items_seen 10
more_itertools.consumer(func)[source]¶
Decorator that automatically advances a PEP-342-style “reverse iterator” to its first yield point so you don’t have to call next() on it manually.
>>> @consumer
... def tally():
...     i = 0
...     while True:
...         print('Thing number %s is %s.' % (i, (yield)))
...         i += 1
...
>>> t = tally()
>>> t.send('red')
Thing number 0 is red.
>>> t.send('fish')
Thing number 1 is fish.
Without the decorator, you would have to call next(t) before t.send() could be used.
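The priming step that the decorator performs can be sketched in a few lines. The names consumer_sketch, collect, and received are hypothetical and exist only for illustration.

```python
from functools import wraps


def consumer_sketch(func):
    """Wrap a generator function so its generators start at the first yield."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        gen = func(*args, **kwargs)
        next(gen)  # advance to the first yield so send() works immediately
        return gen
    return wrapper


received = []


@consumer_sketch
def collect():
    # Store every value sent into the generator.
    while True:
        received.append((yield))


c = collect()
c.send('red')
c.send('fish')
print(received)  # ['red', 'fish']
```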
more_itertools.with_iter(context_manager)[source]¶
Wrap an iterable in a with statement, so it closes once exhausted.
For example, this will close the file when the iterator is exhausted:
upper_lines = (line.upper() for line in with_iter(open('foo')))
Any context manager which returns an iterable is a candidate for with_iter.
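The idea fits in a three-line generator. The sketch below uses an in-memory io.StringIO in place of a real file so that it runs anywhere. The name with_iter_sketch is hypothetical.

```python
from io import StringIO


def with_iter_sketch(context_manager):
    """Yield items from a context manager's iterable, then exit the context."""
    with context_manager as iterable:
        yield from iterable


stream = StringIO('a\nb\n')
upper_lines = [line.upper() for line in with_iter_sketch(stream)]
print(upper_lines)    # ['A\n', 'B\n']
print(stream.closed)  # True
```

The context exits only when the generator finishes or is closed, so a partially consumed iterator keeps the resource open.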
class more_itertools.callback_iter(func, callback_kwd='callback', wait_seconds=0.1)[source]¶
Convert a function that uses callbacks to an iterator.
Let func be a function that takes a callback keyword argument. For example:
>>> def func(callback=None):
...     for i, c in [(1, 'a'), (2, 'b'), (3, 'c')]:
...         if callback:
...             callback(i, c)
...     return 4
Use with callback_iter(func) to get an iterator over the parameters that are delivered to the callback.
>>> with callback_iter(func) as it:
...     for args, kwargs in it:
...         print(args)
(1, 'a')
(2, 'b')
(3, 'c')
The function will be called in a background thread. The done property indicates whether it has completed execution.
If it completes successfully, its return value will be available in the result property.
Notes:
- If the function uses some keyword argument besides callback, supply callback_kwd.
- If it finished executing, but raised an exception, accessing the result property will raise the same exception.
- If it hasn’t finished executing, accessing the result property from within the with block will raise RuntimeError.
- If it hasn’t finished executing, accessing the result property from outside the with block will raise a more_itertools.AbortThread exception.
- Provide wait_seconds to adjust how frequently it is polled for output.
Itertools recipes
more_itertools.iter_except(func, exception, first=None)[source]¶
Yields results from a function repeatedly until an exception is raised.
Converts a call-until-exception interface to an iterator interface. Like iter(func, sentinel), but uses an exception instead of a sentinel to end the loop.
l = [0, 1, 2] list(iter_except(l.pop, IndexError)) [2, 1, 0]
Multiple exceptions can be specified as a stopping condition:
l = [1, 2, 3, '...', 4, 5, 6] list(iter_except(lambda: 1 + l.pop(), (IndexError, TypeError))) [7, 6, 5] list(iter_except(lambda: 1 + l.pop(), (IndexError, TypeError))) [4, 3, 2] list(iter_except(lambda: 1 + l.pop(), (IndexError, TypeError))) []
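The conversion from a call-until-exception interface to an iterator can be sketched as a generator that swallows the stopping exception. This follows the shape of the well-known itertools recipe. The name iter_except_sketch is hypothetical.

```python
def iter_except_sketch(func, exception, first=None):
    """Call func repeatedly and yield its results until exception is raised."""
    try:
        if first is not None:
            yield first()  # optional one-time call before the main loop
        while True:
            yield func()
    except exception:
        pass  # the stopping exception ends iteration quietly


stack = [0, 1, 2]
print(list(iter_except_sketch(stack.pop, IndexError)))  # [2, 1, 0]
```

Because exception is passed straight to an except clause, it may be a single exception class or a tuple of classes.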
Math¶
New itertools
more_itertools.dft(xarr)[source]¶
Discrete Fourier Transform. xarr is a sequence of complex numbers. Yields the components of the corresponding transformed output vector.
import cmath xarr = [1, 2-1j, -1j, -1+2j] Xarr = [2, -2-2j, -2j, 4+4j] all(map(cmath.isclose, dft(xarr), Xarr)) True
See idft() for the inverse Discrete Fourier Transform.
more_itertools.idft(Xarr)[source]¶
Inverse Discrete Fourier Transform. Xarr is a sequence of complex numbers. Yields the components of the corresponding inverse-transformed output vector.
import cmath xarr = [1, 2-1j, -1j, -1+2j] Xarr = [2, -2-2j, -2j, 4+4j] all(map(cmath.isclose, idft(Xarr), xarr)) True
See dft() for the Discrete Fourier Transform.
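For reference, the transform pair can be written directly from its definition. The sketch below is a naive quadratic-time version, assuming the common convention of a negative exponent in the forward transform and a 1/N factor in the inverse. The names dft_sketch and idft_sketch are hypothetical, and the library's internals may differ.

```python
import cmath


def dft_sketch(xarr):
    """Forward transform: X[k] = sum over t of x[t] * exp(-2*pi*i*k*t/N)."""
    n = len(xarr)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(xarr))
        for k in range(n)
    ]


def idft_sketch(Xarr):
    """Inverse transform: x[t] = (1/N) * sum over k of X[k] * exp(2*pi*i*k*t/N)."""
    n = len(Xarr)
    return [
        sum(X * cmath.exp(2j * cmath.pi * k * t / n) for k, X in enumerate(Xarr)) / n
        for t in range(n)
    ]


def all_close(left, right):
    # Compare element-wise with a small absolute tolerance for rounding error.
    return all(cmath.isclose(a, b, abs_tol=1e-9) for a, b in zip(left, right))


xarr = [1, 2-1j, -1j, -1+2j]
Xarr = [2, -2-2j, -2j, 4+4j]
print(all_close(dft_sketch(xarr), Xarr))   # True
print(all_close(idft_sketch(Xarr), xarr))  # True
```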
more_itertools.nth_prime(n)[source]¶
Return the nth prime (counting from 0).
nth_prime(0) 2 nth_prime(100) 547
Itertools recipes
more_itertools.convolve(signal, kernel)[source]¶
Discrete linear convolution of two iterables. Equivalent to polynomial multiplication.
For example, multiplying (x² -x - 20) by (x - 3) gives (x³ -4x² -17x + 60).
list(convolve([1, -1, -20], [1, -3])) [1, -4, -17, 60]
Examples of popular kinds of kernels:
- The kernel [0.25, 0.25, 0.25, 0.25] computes a moving average. For image data, this blurs the image and reduces noise.
- The kernel [1/2, 0, -1/2] estimates the first derivative of a function evaluated at evenly spaced inputs.
- The kernel [1, -2, 1] estimates the second derivative of a function evaluated at evenly spaced inputs.
Convolutions are mathematically commutative; however, the inputs are evaluated differently. The signal is consumed lazily and can be infinite. The kernel is fully consumed before the calculations begin.
Supports all numeric types: int, float, complex, Decimal, Fraction.
References:
- Article: https://betterexplained.com/articles/intuitive-convolution/
- Video by 3Blue1Brown: https://www.youtube.com/watch?v=KuXjwB4LzSA
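The link to polynomial multiplication can be seen in a direct, eager sketch. Unlike the function documented above, this version materializes both inputs, so it cannot handle an infinite signal. The name convolve_sketch is hypothetical.

```python
def convolve_sketch(signal, kernel):
    """Return the full discrete linear convolution of two finite sequences."""
    signal, kernel = list(signal), list(kernel)
    if not signal or not kernel:
        return []
    out = [0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            # Each pair of terms contributes to the output slot i + j,
            # just as x**i * x**j contributes to the x**(i + j) coefficient.
            out[i + j] += s * k
    return out


print(convolve_sketch([1, -1, -20], [1, -3]))  # [1, -4, -17, 60]

# Second-difference kernel applied to the squares 0, 1, 4, 9, 16:
squares = [0, 1, 4, 9, 16]
print(convolve_sketch(squares, [1, -2, 1])[2:5])  # [2, 2, 2]
```

The interior values of the second example are constant because the second derivative of x² is constant. The values at the edges are affected by the boundary and are left out.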
more_itertools.dotproduct(vec1, vec2)[source]¶
Returns the dot product of the two iterables.
dotproduct([10, 15, 12], [0.65, 0.80, 1.25]) 33.5 10 * 0.65 + 15 * 0.80 + 12 * 1.25 33.5
In Python 3.12 and later, use math.sumprod() instead.
more_itertools.factor(n)[source]¶
Yield the prime factors of n.
list(factor(360)) [2, 2, 2, 3, 3, 5]
Finds small factors with trial division. Larger factors are either verified as prime with is_prime or split into smaller factors with Pollard’s rho algorithm.
more_itertools.matmul(m1, m2)[source]¶
Multiply two matrices.
list(matmul([(7, 5), (3, 5)], [(2, 5), (7, 9)])) [(49, 80), (41, 60)]
The caller should ensure that the dimensions of the input matrices are compatible with each other.
Supports all numeric types: int, float, complex, Decimal, Fraction.
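A row-by-column sketch of the multiplication, using only built-ins. The name matmul_sketch is hypothetical. As with the documented function, the caller is assumed to pass matrices with compatible dimensions.

```python
def matmul_sketch(m1, m2):
    """Multiply two matrices given as sequences of rows."""
    columns = list(zip(*m2))  # transpose once; columns are reused for every row
    return [
        tuple(sum(a * b for a, b in zip(row, col)) for col in columns)
        for row in m1
    ]


print(matmul_sketch([(7, 5), (3, 5)], [(2, 5), (7, 9)]))  # [(49, 80), (41, 60)]
```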
more_itertools.multinomial(*counts)[source]¶
Number of distinct arrangements of a multiset.
The expression multinomial(3, 4, 2) has several equivalent interpretations:
- In the expansion of (a + b + c)⁹, the coefficient of the a³b⁴c² term is 1260.
- There are 1260 distinct ways to arrange 9 balls consisting of 3 reds, 4 greens, and 2 blues.
- There are 1260 unique ways to place 9 distinct objects into three bins with sizes 3, 4, and 2.
The multinomial() function computes the length of distinct_permutations(). For example, there are 83,160 distinct anagrams of the word “abracadabra”:
from more_itertools import distinct_permutations, ilen ilen(distinct_permutations('abracadabra')) 83160
This can be computed directly from the letter counts, 5a 2b 2r 1c 1d:
from collections import Counter list(Counter('abracadabra').values()) [5, 2, 2, 1, 1] multinomial(5, 2, 1, 1, 2) 83160
A binomial coefficient is a special case of multinomial where there are only two categories. For example, the number of ways to arrange 12 balls with 5 reds and 7 blues is multinomial(5, 7) or math.comb(12, 5).
When the multiplicities are all just 1, multinomial() reduces to math.factorial, so that multinomial(1, 1, 1, 1, 1, 1, 1) == math.factorial(7).
Reference: https://en.wikipedia.org/wiki/Multinomial_theorem
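The counts above can be reproduced with a product of binomial coefficients, choosing positions for one category at a time. This is a minimal sketch of that identity. The name multinomial_sketch is hypothetical.

```python
from math import comb, factorial


def multinomial_sketch(*counts):
    """Number of distinct arrangements of a multiset with the given counts."""
    total = 0
    result = 1
    for count in counts:
        total += count
        # Choose which of the positions used so far belong to this category.
        result *= comb(total, count)
    return result


print(multinomial_sketch(3, 4, 2))        # 1260
print(multinomial_sketch(5, 2, 1, 1, 2))  # 83160
```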
more_itertools.polynomial_from_roots(roots)[source]¶
Compute a polynomial’s coefficients from its roots.
roots = [5, -4, 3] # (x - 5) * (x + 4) * (x - 3) polynomial_from_roots(roots) # x³ - 4 x² - 17 x + 60 [1, -4, -17, 60]
Supports all numeric types: int, float, complex, Decimal, Fraction.
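Expanding the product of (x - root) factors one root at a time gives the coefficients. The sketch below does that with plain lists, highest power first to match the example above. The name polynomial_from_roots_sketch is hypothetical.

```python
def polynomial_from_roots_sketch(roots):
    """Return coefficients, highest power first, of the product of (x - root)."""
    coefficients = [1]
    for root in roots:
        expanded = [0] * (len(coefficients) + 1)
        for i, c in enumerate(coefficients):
            expanded[i] += c             # contribution from c * x
            expanded[i + 1] -= c * root  # contribution from c * (-root)
        coefficients = expanded
    return coefficients


print(polynomial_from_roots_sketch([5, -4, 3]))  # [1, -4, -17, 60]
```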
more_itertools.polynomial_derivative(coefficients)[source]¶
Compute the first derivative of a polynomial.
Evaluate the derivative of x³ - 4 x² - 17 x + 60:
coefficients = [1, -4, -17, 60] derivative_coefficients = polynomial_derivative(coefficients) derivative_coefficients [3, -8, -17]
Supports all numeric types: int, float, complex, Decimal, Fraction.
more_itertools.polynomial_eval(coefficients, x)[source]¶
Evaluate a polynomial at a specific value.
Computes with better numeric stability than Horner’s method.
Evaluate x^3 - 4 * x^2 - 17 * x + 60 at x = 2.5:
coefficients = [1, -4, -17, 60] x = 2.5 polynomial_eval(coefficients, x) 8.125
Supports all numeric types: int, float, complex, Decimal, Fraction.
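For comparison, Horner's method, the baseline that the text above contrasts with, can be sketched as below. This is deliberately the simpler technique and not the library's approach. The name horner_eval is hypothetical.

```python
def horner_eval(coefficients, x):
    """Evaluate a polynomial (highest power first) using Horner's method."""
    result = 0
    for c in coefficients:
        # Fold in one coefficient per step: (((c0)*x + c1)*x + c2)*x + ...
        result = result * x + c
    return result


print(horner_eval([1, -4, -17, 60], 2.5))  # 8.125
```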
more_itertools.sieve(n)[source]¶
Yield the primes less than n.
list(sieve(30)) [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
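A textbook Sieve of Eratosthenes produces the same output for small n. The sketch below is eager and unoptimized, and it is not claimed to match the library's implementation. The name sieve_sketch is hypothetical.

```python
from math import isqrt


def sieve_sketch(n):
    """Return the primes strictly less than n."""
    if n < 3:
        return []
    is_candidate = [True] * n
    is_candidate[0] = is_candidate[1] = False
    for p in range(2, isqrt(n) + 1):
        if is_candidate[p]:
            # Cross out multiples of p, starting at p*p; smaller multiples
            # were already removed by smaller primes.
            for multiple in range(p * p, n, p):
                is_candidate[multiple] = False
    return [i for i, flag in enumerate(is_candidate) if flag]


print(sieve_sketch(30))  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```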
more_itertools.is_prime(n)[source]¶
Return True if n is prime and False otherwise.
Basic examples:
is_prime(37) True is_prime(3 * 13) False is_prime(18_446_744_073_709_551_557) True
Find the next prime over one billion:
next(filter(is_prime, count(10**9))) 1000000007
Generate random primes up to 200 bits and up to 60 decimal digits:
from random import seed, randrange, getrandbits seed(18675309)
next(filter(is_prime, map(getrandbits, repeat(200)))) 893303929355758292373272075469392561129886005037663238028407
next(filter(is_prime, map(randrange, repeat(10**60)))) 269638077304026462407872868003560484232362454342414618963649
This function is exact for values of n below 10**24. For larger inputs, the probabilistic Miller-Rabin primality test has a less than 1 in 2**128 chance of a false positive.
more_itertools.sum_of_squares(it)[source]¶
Return the sum of the squares of the input values.
sum_of_squares([10, 20, 30]) 1400
Supports all numeric types: int, float, complex, Decimal, Fraction.
more_itertools.totient(n)[source]¶
Return the count of natural numbers up to n that are coprime with n.
Euler’s totient function φ(n) gives the number of totatives. Totatives are integers k in the range 1 ≤ k ≤ n such that gcd(n, k) = 1.
n = 9 totient(n) 6
totatives = [x for x in range(1, n) if gcd(n, x) == 1] totatives [1, 2, 4, 5, 7, 8] len(totatives) 6
Reference: https://en.wikipedia.org/wiki/Euler%27s_totient_function
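The definition translates directly into a brute-force count. The sketch below is meant only to make the definition concrete and is far slower than a factorization-based approach for large n. The name totient_sketch is hypothetical.

```python
from math import gcd


def totient_sketch(n):
    """Count the integers k with 1 <= k <= n that are coprime with n."""
    return sum(1 for k in range(1, n + 1) if gcd(n, k) == 1)


print(totient_sketch(9))   # 6
print(totient_sketch(12))  # 4
```

Including k = n in the range matters only for n = 1, where the single totative is 1 itself.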
Others¶
New itertools
more_itertools.locate(iterable, pred=bool, window_size=None)[source]¶
Yield the index of each item in iterable for which pred returns True.
pred defaults to bool(), which will select truthy items:
list(locate([0, 1, 1, 0, 1, 0, 0])) [1, 2, 4]
Set pred to a custom function to, e.g., find the indexes for a particular item.
list(locate(['a', 'b', 'c', 'b'], lambda x: x == 'b')) [1, 3]
If window_size is given, then the pred function will be called with that many items. This enables searching for sub-sequences:
iterable = [0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3] pred = lambda *args: args == (1, 2, 3) list(locate(iterable, pred=pred, window_size=3)) [1, 5, 9]
Use with seekable() to find indexes and then retrieve the associated items:
from itertools import count from more_itertools import seekable source = (3 * n + 1 if (n % 2) else n // 2 for n in count()) it = seekable(source) pred = lambda x: x > 100 indexes = locate(it, pred=pred) i = next(indexes) it.seek(i) next(it) 106
more_itertools.rlocate(iterable, pred=bool, window_size=None)[source]¶
Yield the index of each item in iterable for which pred returns True, starting from the right and moving left.
pred defaults to bool(), which will select truthy items:
list(rlocate([0, 1, 1, 0, 1, 0, 0])) # Truthy at 1, 2, and 4 [4, 2, 1]
Set pred to a custom function to, e.g., find the indexes for a particular item:
iterable = iter('abcb') pred = lambda x: x == 'b' list(rlocate(iterable, pred)) [3, 1]
If window_size is given, then the pred function will be called with that many items. This enables searching for sub-sequences:
iterable = [0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3] pred = lambda *args: args == (1, 2, 3) list(rlocate(iterable, pred=pred, window_size=3)) [9, 5, 1]
Beware, this function won’t return anything for infinite iterables. If iterable is reversible, rlocate will reverse it and search from the right. Otherwise, it will search from the left and return the results in reverse order.
See locate() for other example applications.
more_itertools.replace(iterable, pred, substitutes, count=None, window_size=1)[source]¶
Yield the items from iterable, replacing the items for which pred returns True with the items from the iterable substitutes.
iterable = [1, 1, 0, 1, 1, 0, 1, 1] pred = lambda x: x == 0 substitutes = (2, 3) list(replace(iterable, pred, substitutes)) [1, 1, 2, 3, 1, 1, 2, 3, 1, 1]
If count is given, the number of replacements will be limited:
iterable = [1, 1, 0, 1, 1, 0, 1, 1, 0] pred = lambda x: x == 0 substitutes = [None] list(replace(iterable, pred, substitutes, count=2)) [1, 1, None, 1, 1, None, 1, 1, 0]
Use window_size to control the number of items passed as arguments to pred. This allows for locating and replacing subsequences.
iterable = [0, 1, 2, 5, 0, 1, 2, 5] window_size = 3 pred = lambda *args: args == (0, 1, 2) # 3 items passed to pred substitutes = [3, 4] # Splice in these items list(replace(iterable, pred, substitutes, window_size=window_size)) [3, 4, 5, 3, 4, 5]
more_itertools.numeric_range(stop)[source]¶
more_itertools.numeric_range(start, stop[, step])[source]
An extension of the built-in range() function whose arguments can be any orderable numeric type.
With only stop specified, start defaults to 0 and step defaults to 1. The output items will match the type of stop:
list(numeric_range(3.5)) [0.0, 1.0, 2.0, 3.0]
With only start and stop specified, step defaults to 1. The output items will match the type of start:
from decimal import Decimal start = Decimal('2.1') stop = Decimal('5.1') list(numeric_range(start, stop)) [Decimal('2.1'), Decimal('3.1'), Decimal('4.1')]
With start, stop, and step specified, the output items will match the type of start + step:
from fractions import Fraction start = Fraction(1, 2) # Start at 1/2 stop = Fraction(5, 2) # End at 5/2 step = Fraction(1, 2) # Count by 1/2 list(numeric_range(start, stop, step)) [Fraction(1, 2), Fraction(1, 1), Fraction(3, 2), Fraction(2, 1)]
If step is zero, ValueError is raised. Negative steps are supported:
list(numeric_range(3, -1, -1.0)) [3.0, 2.0, 1.0, 0.0]
Be aware of the limitations of floating point numbers; the representation of the yielded numbers may be surprising.
datetime.datetime objects can be used for start and stop, if step is a datetime.timedelta object:
import datetime start = datetime.datetime(2019, 1, 1) stop = datetime.datetime(2019, 1, 3) step = datetime.timedelta(days=1) items = iter(numeric_range(start, stop, step)) next(items) datetime.datetime(2019, 1, 1, 0, 0) next(items) datetime.datetime(2019, 1, 2, 0, 0)
more_itertools.side_effect(func, iterable, chunk_size=None, before=None, after=None)[source]¶
Invoke func on each item in iterable (or on each chunk_size group of items) before yielding the item.
func must be a function that takes a single argument. Its return value will be discarded.
before and after are optional functions that take no arguments. They will be executed before iteration starts and after it ends, respectively.
side_effect can be used for logging, updating progress bars, or anything that is not functionally “pure.”
Emitting a status message:
from more_itertools import consume func = lambda item: print('Received {}'.format(item)) consume(side_effect(func, range(2))) Received 0 Received 1
Operating on chunks of items:
pair_sums = [] func = lambda chunk: pair_sums.append(sum(chunk)) list(side_effect(func, [0, 1, 2, 3, 4, 5], 2)) [0, 1, 2, 3, 4, 5] list(pair_sums) [1, 5, 9]
Writing to a file-like object:
from io import StringIO from more_itertools import consume f = StringIO() func = lambda x: print(x, file=f) before = lambda: print(u'HEADER', file=f) after = f.close it = [u'a', u'b', u'c'] consume(side_effect(func, it, before=before, after=after)) f.closed True
more_itertools.iterate(func, start)[source]¶
Return start, func(start), func(func(start)), …
Produces an infinite iterator. To add a stopping condition, use take(), takewhile, or takewhile_inclusive():
take(10, iterate(lambda x: 2*x, 1)) [1, 2, 4, 8, 16, 32, 64, 128, 256, 512]
collatz = lambda x: 3*x + 1 if x%2==1 else x // 2 list(takewhile_inclusive(lambda x: x!=1, iterate(collatz, 10))) [10, 5, 16, 8, 4, 2, 1]
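The repeated application can be sketched as a small generator, using itertools.islice and itertools.takewhile from the standard library to bound it. The name iterate_sketch is hypothetical.

```python
from itertools import islice, takewhile


def iterate_sketch(func, start):
    """Yield start, func(start), func(func(start)), ... without end."""
    value = start
    while True:
        yield value
        value = func(value)


powers_of_two = list(islice(iterate_sketch(lambda x: 2 * x, 1), 10))
print(powers_of_two)  # [1, 2, 4, 8, 16, 32, 64, 128, 256, 512]

collatz = lambda x: 3 * x + 1 if x % 2 == 1 else x // 2
before_one = list(takewhile(lambda x: x != 1, iterate_sketch(collatz, 10)))
print(before_one)  # [10, 5, 16, 8, 4, 2]
```

Plain takewhile drops the item that fails the test, which is why the final 1 is missing here but present in the takewhile_inclusive() example above.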
more_itertools.difference(iterable, func=operator.sub, *, initial=None)[source]¶
This function is the inverse of itertools.accumulate(). By default it will compute the first difference of iterable using operator.sub():
from itertools import accumulate iterable = accumulate([0, 1, 2, 3, 4]) # produces 0, 1, 3, 6, 10 list(difference(iterable)) [0, 1, 2, 3, 4]
func defaults to operator.sub(), but other functions can be specified. They will be applied as follows:
A, B, C, D, ... --> A, func(B, A), func(C, B), func(D, C), ...
For example, to do progressive division:
iterable = [1, 2, 6, 24, 120] func = lambda x, y: x // y list(difference(iterable, func)) [1, 2, 3, 4, 5]
If the initial keyword is set, the first element will be skipped when computing successive differences.
it = [10, 11, 13, 16] # from accumulate([1, 2, 3], initial=10) list(difference(it, initial=10)) [1, 2, 3]
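The inverse relationship with itertools.accumulate() can be checked with an eager sketch that pairs each item with its predecessor. It materializes the input, so it is suitable only for finite iterables. The name difference_sketch is hypothetical.

```python
from itertools import accumulate
from operator import sub


def difference_sketch(iterable, func=sub, *, initial=None):
    """Return A, func(B, A), func(C, B), ...; drop A when initial is given."""
    items = list(iterable)
    diffs = [func(b, a) for a, b in zip(items, items[1:])]
    if initial is not None:
        return diffs  # the first item was the seed, not real data
    return items[:1] + diffs


original = [0, 1, 2, 3, 4]
print(difference_sketch(accumulate(original)))           # [0, 1, 2, 3, 4]
print(difference_sketch([10, 11, 13, 16], initial=10))   # [1, 2, 3]
```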
more_itertools.make_decorator(wrapping_func, result_index=0)[source]¶
Return a decorator version of wrapping_func, which is a function that modifies an iterable. result_index is the position in that function’s signature where the iterable goes.
This lets you use itertools on the “production end,” i.e. at function definition. This can augment what the function returns without changing the function’s code.
For example, to produce a decorator version of chunked():
>>> from more_itertools import chunked
>>> chunker = make_decorator(chunked, result_index=0)
>>> @chunker(3)
... def iter_range(n):
...     return iter(range(n))
...
>>> list(iter_range(9))
[[0, 1, 2], [3, 4, 5], [6, 7, 8]]
To only allow truthy items to be returned:
>>> truth_serum = make_decorator(filter, result_index=1)
>>> @truth_serum(bool)
... def boolean_test():
...     return [0, 1, '', ' ', False, True]
...
>>> list(boolean_test())
[1, ' ', True]
The peekable() and seekable() wrappers make for practical decorators:
>>> from more_itertools import peekable
>>> peekable_function = make_decorator(peekable)
>>> @peekable_function()
... def str_range(*args):
...     return (str(x) for x in range(*args))
...
>>> it = str_range(1, 20, 2)
>>> next(it), next(it), next(it)
('1', '3', '5')
>>> it.peek()
'7'
>>> next(it)
'7'
class more_itertools.SequenceView(target)[source]¶
Return a read-only view of the sequence object target.
SequenceView objects are analogous to Python’s built-in “dictionary view” types. They provide a dynamic view of a sequence’s items, meaning that when the sequence updates, so does the view.
seq = ['0', '1', '2'] view = SequenceView(seq) view SequenceView(['0', '1', '2']) seq.append('3') view SequenceView(['0', '1', '2', '3'])
Sequence views support indexing, slicing, and length queries. They act like the underlying sequence, except they don’t allow assignment:
view[1] '1' view[1:-1] ['1', '2'] len(view) 4
Sequence views are useful as an alternative to copying, as they don’t require (much) extra storage.
more_itertools.time_limited(limit_seconds, iterable)[source]¶
Yield items from iterable until limit_seconds have passed. If the time limit expires before all items have been yielded, the timed_out attribute will be set to True.
>>> from time import sleep
>>> def generator():
...     yield 1
...     yield 2
...     sleep(0.2)
...     yield 3
>>> iterable = time_limited(0.1, generator())
>>> list(iterable)
[1, 2]
>>> iterable.timed_out
True
Note that the time is checked before each item is yielded, and iteration stops if the time elapsed is greater than limit_seconds. If your time limit is 1 second, but it takes 2 seconds to generate the first item from the iterable, the function will run for 2 seconds and not yield anything. As a special case, when limit_seconds is zero, the iterator never returns anything.
more_itertools.map_if(iterable, pred, func, func_else=lambda x: ...)[source]¶
Evaluate each item from iterable using pred. If the result is equivalent to True, transform the item with func and yield it. Otherwise, transform the item with func_else and yield it.
pred, func, and func_else should each be functions that accept one argument. By default, func_else is the identity function.
>>> from math import sqrt
>>> iterable = list(range(-5, 5))
>>> iterable
[-5, -4, -3, -2, -1, 0, 1, 2, 3, 4]
>>> list(map_if(iterable, lambda x: x > 3, lambda x: 'toobig'))
[-5, -4, -3, -2, -1, 0, 1, 2, 3, 'toobig']
>>> list(map_if(iterable, lambda x: x >= 0,
... lambda x: f'{sqrt(x):.2f}', lambda x: None))
[None, None, None, None, None, '0.00', '1.00', '1.41', '1.73', '2.00']
more_itertools.doublestarmap(func, iterable)[source]¶
Apply func to every item of iterable by dictionary unpacking the item into func.
The difference between itertools.starmap() and doublestarmap() parallels the distinction between func(*a) and func(**a).
iterable = [{'a': 1, 'b': 2}, {'a': 40, 'b': 60}] list(doublestarmap(lambda a, b: a + b, iterable)) [3, 100]
TypeError will be raised if func’s signature doesn’t match the mapping contained in iterable or if iterable does not contain mappings.
Itertools recipes
more_itertools.iter_index(iterable, value, start=0, stop=None)[source]¶
Yield the index of each place in iterable that value occurs, beginning with index start and ending before index stop.
list(iter_index('AABCADEAF', 'A')) [0, 1, 4, 7] list(iter_index('AABCADEAF', 'A', 1)) # start index is inclusive [1, 4, 7] list(iter_index('AABCADEAF', 'A', 1, 7)) # stop index is not inclusive [1, 4]
The behavior for non-scalar values matches the built-in Python types.
list(iter_index('ABCDABCD', 'AB')) [0, 4] list(iter_index([0, 1, 2, 3, 0, 1, 2, 3], [0, 1])) [] list(iter_index([[0, 1], [2, 3], [0, 1], [2, 3]], [0, 1])) [0, 2]
See locate() for a more general means of finding the indexes associated with particular values.
more_itertools.consume(iterator, n=None)[source]¶
Advance iterator by n steps. If n is None, consume it entirely.
Efficiently exhausts an iterator without returning values. Defaults to consuming the whole iterator, but an optional second argument may be provided to limit consumption.
>>> i = (x for x in range(10))
>>> next(i)
0
>>> consume(i, 3)
>>> next(i)
4
>>> consume(i)
>>> next(i)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
If the iterator has fewer items remaining than the provided limit, the whole iterator will be consumed.
>>> i = (x for x in range(3))
>>> consume(i, 5)
>>> next(i)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
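Both modes can be sketched with standard-library tools, following the familiar itertools recipe: a zero-length deque discards everything, and an empty islice skips exactly n items. The name consume_sketch is hypothetical.

```python
from collections import deque
from itertools import islice


def consume_sketch(iterator, n=None):
    """Advance iterator by n steps, or exhaust it when n is None."""
    if n is None:
        deque(iterator, maxlen=0)  # discards every item at C speed
    else:
        # The slice [n:n] is empty, but reaching it skips the first n items.
        next(islice(iterator, n, n), None)


i = iter(range(10))
first = next(i)             # 0
consume_sketch(i, 3)        # skips 1, 2, 3
after_skip = next(i)        # 4
consume_sketch(i)           # exhausts the rest
leftover = next(i, 'done')  # 'done'
```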
more_itertools.tabulate(function, start=0)[source]¶
Return an iterator over the results of function(start), function(start + 1), function(start + 2), …
function should be a function that accepts one integer argument.
If start is not specified it defaults to 0. It will be incremented each time the iterator is advanced.
square = lambda x: x ** 2 iterator = tabulate(square, -3) take(4, iterator) [9, 4, 1, 0]
more_itertools.repeatfunc(func, times=None, *args)[source]¶
Call func with args repeatedly, returning an iterable over the results.
If times is specified, the iterable will terminate after that many repetitions:
from operator import add times = 4 args = 3, 5 list(repeatfunc(add, times, *args)) [8, 8, 8, 8]
If times is None, the iterable will not terminate:
from random import randrange times = None args = 1, 11 take(6, repeatfunc(randrange, times, *args)) [2, 4, 8, 1, 8, 4]
more_itertools.reshape(matrix, cols)[source]¶
Reshape the 2-D input matrix to have a column count given by cols.
matrix = [(0, 1), (2, 3), (4, 5)] cols = 3 list(reshape(matrix, cols)) [(0, 1, 2), (3, 4, 5)]
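Reshaping amounts to flattening the rows and regrouping the items cols at a time. The sketch below assumes that interpretation and uses only itertools. The name reshape_sketch is hypothetical.

```python
from itertools import chain, islice


def reshape_sketch(matrix, cols):
    """Flatten matrix row by row, then yield tuples of cols items each."""
    flat = chain.from_iterable(matrix)
    # Keep slicing off cols items until the flattened stream runs dry.
    while row := tuple(islice(flat, cols)):
        yield row


matrix = [(0, 1), (2, 3), (4, 5)]
print(list(reshape_sketch(matrix, 3)))  # [(0, 1, 2), (3, 4, 5)]
```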
more_itertools.loops(n)[source]¶
Returns an iterable with n elements for efficient looping. Like range(n) but doesn’t create integers.
i = 0 for _ in loops(5): ... i += 1 i 5