itertools is a standard library module that exposes composable iteration building blocks. Its value is not in replacing a for loop with a cryptically named function, but in processing data streams without ever loading them fully into memory. Every function returns a lazy iterator: nothing is computed until you consume the result. That is what lets you chain transformations over millions of elements with a constant memory footprint.
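To make the laziness concrete, here is a minimal sketch. The `noisy_source` generator and the `pulled` log are hypothetical names introduced only for this illustration; they record which elements the pipeline actually pulls from the source.

```python
from itertools import islice

pulled = []  # hypothetical log of every value the source has produced

def noisy_source(n):
    """Hypothetical generator that records each value as it yields it."""
    for i in range(n):
        pulled.append(i)
        yield i

# Building the pipeline computes nothing.
squares = map(lambda x: x * x, noisy_source(1_000_000))
first_three = islice(squares, 3)
pulled_before = list(pulled)      # [] because nothing has been consumed yet

# Consuming the result pulls only what is needed.
result = list(first_three)        # [0, 1, 4]
pulled_after = list(pulled)       # [0, 1, 2]
```

The source is declared with a million elements, yet only three are ever produced.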
Here are the functions I actually use, grouped by purpose, along with the pitfalls that waste time.
Infinite iterators: count, cycle, repeat
These three functions produce endless streams. They are only useful with a stopping condition; without one, the loop never terminates.
count(start, step) generates an infinite arithmetic sequence. Handy for numbering without managing a manual counter.
from itertools import count, islice
for i in count(10, 2):
    if i > 20:
        break
    print(i) # 10, 12, 14, 16, 18, 20
cycle(iterable) repeats an iterable indefinitely. Typical for alternating between resources (round-robin over servers, colors, workers).
from itertools import cycle
colors = cycle(['red', 'green', 'blue'])
for _, color in zip(range(5), colors):
    print(color) # red, green, blue, red, green
repeat(elem, times) repeats a value. Without times, it is infinite. Its main use is supplying a constant argument to map or starmap.
from itertools import repeat
list(repeat(7, 3)) # [7, 7, 7]
list(map(pow, range(5), repeat(2))) # [0, 1, 4, 9, 16] — each base squared
The pitfall. count(), cycle(), and repeat() without times never stop. Always bound them with islice, takewhile, a zip over a finite sequence, or a break. list(count()) never returns: it keeps allocating until the process runs out of memory.
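The bounding strategies listed above can be sketched side by side. The task and worker names are placeholders chosen for the example.

```python
from itertools import count, cycle, repeat, islice, takewhile

# 1. islice: take a fixed number of elements.
first_five = list(islice(count(1), 5))
# [1, 2, 3, 4, 5]

# 2. takewhile: stop as soon as a condition fails.
below_twenty = list(takewhile(lambda n: n < 20, count(0, 5)))
# [0, 5, 10, 15]

# 3. zip: the finite iterable decides when to stop.
tasks = ['build', 'test', 'deploy']          # hypothetical task names
assigned = list(zip(tasks, cycle(['w1', 'w2'])))
# [('build', 'w1'), ('test', 'w2'), ('deploy', 'w1')]

# 4. repeat with an explicit times argument is already finite.
padding = list(repeat(0, 4))
# [0, 0, 0, 0]
```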
Slicing and filtering a stream
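islice(iterable, start, stop, step) applies slice-style cutting to any iterator, including infinite ones, without materializing it into a list. With a single number, as in islice(iterable, stop), it takes the first stop elements. Unlike a real slice, it does not accept negative values. It is the go-to tool for bounding a stream.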
from itertools import islice, count
list(islice(count(), 5)) # [0, 1, 2, 3, 4]
list(islice(count(), 2, 8, 2)) # [2, 4, 6]
takewhile(pred, iterable) returns elements while the predicate holds, then stops at the first failure. dropwhile does the opposite: it skips elements while the predicate holds, then returns everything else without evaluating the predicate again.
from itertools import takewhile, dropwhile
data = [2, 3, 8, 1, 9, 4]
list(takewhile(lambda x: x < 5, data)) # [2, 3] — stops at 8
list(dropwhile(lambda x: x < 5, data)) # [8, 1, 9, 4] — skips 2 and 3, keeps the rest
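One consequence of "stops at the first failure" is worth a short sketch: takewhile has to pull the failing element in order to test it, so on a one-shot iterator that element is consumed and cannot be recovered from the underlying stream. The variable names below are illustrative.

```python
from itertools import takewhile

stream = iter([2, 3, 8, 1, 9, 4])

head = list(takewhile(lambda x: x < 5, stream))   # [2, 3]
rest = list(stream)                               # [1, 9, 4]: the 8 was consumed by the test
```

If the boundary element matters, work on a list (which can be re-read) instead of a one-shot iterator.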
filterfalse(pred, iterable) is the complement of filter: it keeps the elements for which the predicate is false. More readable than filter(lambda x: not pred(x), ...).
from itertools import filterfalse
list(filterfalse(lambda x: x % 2, range(10))) # [0, 2, 4, 6, 8] — the even numbers
compress(data, selectors) filters data according to a second iterable of booleans. Useful when the selection mask is computed elsewhere, separately from the data.
from itertools import compress
names = ['Alice', 'Bob', 'Carol', 'Dan']
active = [True, False, True, False]
list(compress(names, active)) # ['Alice', 'Carol']
Combining and transforming
chain(*iterables) strings several iterables into a single sequence, without creating an intermediate list. Its chain.from_iterable variant flattens an iterable of iterables, ideal for lazy streams.
from itertools import chain
list(chain([1, 2], [3, 4], [5])) # [1, 2, 3, 4, 5]
list(chain.from_iterable([[1, 2], [3, 4]])) # [1, 2, 3, 4]
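Why from_iterable suits lazy streams can be shown with an outer iterable that is itself infinite. This is a sketch; `multiples` is a hypothetical helper. Writing chain(*outer) here would try to unpack the infinite generator up front and never return, while from_iterable pulls one inner list at a time.

```python
from itertools import chain, count, islice

def multiples(n):
    """Hypothetical helper: the first three multiples of n."""
    return [n, 2 * n, 3 * n]

# The outer generator is infinite; only the inner lists actually needed are built.
flat = chain.from_iterable(multiples(n) for n in count(1))
first_seven = list(islice(flat, 7))   # [1, 2, 3, 2, 4, 6, 3]
```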
accumulate(iterable, func, initial) produces running results. By default it is a cumulative sum, but any binary function works (max, operator.mul, etc.). The initial parameter (Python 3.8+, keyword-only) sets a starting value.
from itertools import accumulate
import operator
list(accumulate([1, 2, 3, 4])) # [1, 3, 6, 10] — running sum
list(accumulate([1, 2, 3, 4], operator.mul)) # [1, 2, 6, 24] — running product
list(accumulate([3, 1, 4, 1, 5], max)) # [3, 3, 4, 4, 5] — running maximum
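The initial parameter is not covered by the examples above, so here is a minimal sketch with made-up account movements. Note that the starting value is emitted first, so the output has one more element than the input.

```python
from itertools import accumulate

deposits = [50, -20, 30]   # hypothetical account movements
balances = list(accumulate(deposits, initial=100))
# [100, 150, 130, 160]
```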
starmap(func, iterable) applies a function to arguments already grouped into tuples. It is map when the arguments are pre-packed: starmap(f, [(a, b)]) calls f(a, b).
from itertools import starmap
points = [(3, 4), (6, 8), (5, 12)]
list(starmap(lambda x, y: (x**2 + y**2)**0.5, points)) # [5.0, 10.0, 13.0]
pairwise(iterable) (Python 3.10+) returns overlapping consecutive pairs of elements. Perfect for computing differences or comparing each element to the next.
from itertools import pairwise
list(pairwise([1, 2, 3, 4])) # [(1, 2), (2, 3), (3, 4)]
temps = [10, 13, 12, 18]
[b - a for a, b in pairwise(temps)] # [3, -1, 6] — successive changes
zip_longest(*iterables, fillvalue) merges several iterables like zip, but aligns on the longest by filling gaps with fillvalue instead of stopping at the shortest.
from itertools import zip_longest
list(zip_longest([1, 2, 3], ['a', 'b'], fillvalue='?'))
# [(1, 'a'), (2, 'b'), (3, '?')]
groupby: the sort-first pitfall
groupby(iterable, key) groups consecutive elements sharing the same key. The important word is consecutive: groupby only groups adjacent runs, it does not sort. On input not sorted by the key, you get fragmented groups.
from itertools import groupby
data = [('FR', 'Paris'), ('US', 'NYC'), ('FR', 'Lyon')]
# Wrong: not sorted by country → 'FR' shows up in two separate groups
for country, group in groupby(data, key=lambda x: x[0]):
    print(country, [v for _, v in group])
# FR ['Paris']
# US ['NYC']
# FR ['Lyon']
# Right: sort first by the same key
data.sort(key=lambda x: x[0])
for country, group in groupby(data, key=lambda x: x[0]):
    print(country, [v for _, v in group])
# FR ['Paris', 'Lyon']
# US ['NYC']
This is the most common mistake with this function: forgetting to sort with the same key that groupby uses. Another subtlety: each group is an iterator that shares the underlying stream with groupby itself, so once you move on to the next group, the previous one can no longer be read. Consume or copy each group (for example with list(group)) before advancing. The key function pairs well with operator.itemgetter, which is faster and more readable than a lambda.
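Both points can be sketched together: one itemgetter shared by the sort and the grouping, and the effect of reading a group too late. The variable names are illustrative.

```python
from itertools import groupby
from operator import itemgetter

data = [('FR', 'Paris'), ('US', 'NYC'), ('FR', 'Lyon')]
by_country = itemgetter(0)

# Same key object for the sort and for the grouping.
ordered = sorted(data, key=by_country)

# Pitfall: keeping the group iterators and reading them after groupby has advanced.
stale = [(country, group) for country, group in groupby(ordered, key=by_country)]
stale_contents = {country: list(group) for country, group in stale}
# 'FR' comes back empty: its elements were skipped when groupby moved on.

# Fix: materialize each group while it is still the current one.
cities = {country: [city for _, city in group]
          for country, group in groupby(ordered, key=by_country)}
# {'FR': ['Paris', 'Lyon'], 'US': ['NYC']}
```

Using sorted() rather than data.sort() leaves the original list untouched, which is a design choice of this sketch and not a requirement of groupby.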
tee: duplicating an iterator, with caution
tee(iterable, n) returns n independent iterators from a single one. It does not copy the data: the iterators share an internal buffer that holds everything the slowest one has not yet consumed.
from itertools import tee
it = iter([1, 2, 3, 4])
a, b = tee(it, 2)
list(a) # [1, 2, 3, 4]
list(b) # [1, 2, 3, 4]
Two real pitfalls. First, do not touch the source iterable after tee: continuing to consume it desynchronizes the copies. Second, if one iterator races far ahead of the other, the internal buffer grows to hold the pending elements. Duplicating a stream then fully consuming the first copy before the second amounts to keeping everything in memory, which defeats the point of laziness. tee is efficient only if the copies advance at roughly the same pace.
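A case where tee behaves well is a lockstep walk with a fixed lag. The sketch below follows the well-known pairwise recipe; `lagged_pairs` is a hypothetical name. The two copies are never more than one element apart, so the shared buffer stays tiny even on an infinite source.

```python
from itertools import count, islice, tee

def lagged_pairs(iterable):
    """Hypothetical helper: (current, next) pairs built from two tee copies."""
    current, ahead = tee(iterable)
    next(ahead, None)            # move one copy a single element ahead
    return zip(current, ahead)   # from here on, both copies advance together

sample = list(islice(lagged_pairs(count()), 3))   # [(0, 1), (1, 2), (2, 3)]
```

When one copy really must be read to the end before the other starts, building a list once and iterating over it twice is usually the simpler option.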
Combinatorics: product, permutations, combinations
These functions generate arrangements. They stay lazy, but beware: the number of results explodes fast (factorial or exponential).
product(*iterables, repeat) computes the Cartesian product. It replaces nested for loops.
from itertools import product
list(product([1, 2], ['a', 'b'])) # [(1,'a'), (1,'b'), (2,'a'), (2,'b')]
list(product([0, 1], repeat=3)) # all binary combinations over 3 bits
permutations(iterable, r) generates all ordered arrangements of length r (order matters). combinations(iterable, r) generates unordered subsets of length r (order does not matter). combinations_with_replacement additionally allows repeating the same element.
from itertools import permutations, combinations, combinations_with_replacement
list(permutations('ABC', 2)) # AB AC BA BC CA CB — 6 arrangements
list(combinations('ABC', 2)) # AB AC BC — 3 combinations
list(combinations_with_replacement('ABC', 2)) # AA AB AC BB BC CC — 6 with replacement
The pitfall. permutations(range(10)) produces 3,628,800 tuples. Never wrap these functions in a list() without bounding the input size, or you risk exhausting memory. Iterate directly with a break or an islice when you only need a sample.
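One way to act on this advice is to compute the output size before generating anything, then sample lazily. This sketch assumes Python 3.8+ for math.comb and math.perm.

```python
from itertools import islice, permutations
from math import comb, factorial, perm

# Count the results first: these are plain integer computations.
n_perms = factorial(10)   # 3628800 full permutations of 10 items
n_pairs = perm(10, 2)     # 90 ordered pairs
n_combs = comb(10, 2)     # 45 unordered pairs

# Then take a sample without building the full list.
sample = list(islice(permutations(range(10)), 3))
# [(0, 1, 2, 3, 4, 5, 6, 7, 8, 9),
#  (0, 1, 2, 3, 4, 5, 6, 7, 9, 8),
#  (0, 1, 2, 3, 4, 5, 6, 8, 7, 9)]
```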
Recap
| Function | Role | Keep in mind |
|---|---|---|
| count / cycle / repeat | infinite streams | bound with islice or takewhile |
| islice | lazy slice | works on an infinite iterator |
| takewhile / dropwhile | cut by a predicate | stops / skips at the first switch |
| filterfalse / compress | filtering | complement of filter / external mask |
| chain | concatenate iterables | from_iterable to flatten |
| accumulate | running results | customizable func and initial |
| starmap | map over tuples of arguments | starmap(f, [(a, b)]) calls f(a, b) |
| pairwise | consecutive pairs | Python 3.10+, ideal for deltas |
| zip_longest | zip without truncation | fills with fillvalue |
| groupby | group runs | sort first by the key |
| tee | duplicate an iterator | memory-heavy if desynchronized |
| product / permutations / combinations | combinatorics | output explodes, bound the input |
itertools does not provide anything impossible to write by hand. It provides C-level, tested primitives that compose with each other and preserve laziness end to end. Where a list comprehension materializes everything, an itertools pipeline processes one element at a time. On large volumes, that is the difference between a script that fits in memory and one that saturates it. For complementary data structures, see the collections module.
