Python operator: itemgetter, attrgetter and the art of replacing lambdas

The operator module has been part of Python’s standard library for a very long time, and yet many developers keep writing lambda x: x[0] or lambda obj: obj.name when an operator function would do the same job, usually faster and more readably. Understanding what this module offers, and how it is implemented, changes the way you write functional code in Python.

What operator contains

operator exposes functions that match the language’s operators. operator.add(2, 3) is the functional equivalent of 2 + 3, and operator.lt(a, b) corresponds to a < b. The point is not to replace operators in ordinary arithmetic code; that would be absurd. The point is being able to pass an operation as an argument to a higher-order function (map, filter, sorted, functools.reduce, functools.partial).

import operator
from functools import reduce

# Sum of a list, no lambda
total = reduce(operator.add, [1, 2, 3, 4])  # 10

# Cumulative product
product = reduce(operator.mul, [1, 2, 3, 4])  # 24

In CPython, operator.add is a function implemented in C. Passing it to reduce avoids setting up a Python frame on every call, which adds up on large volumes.

What is a frame? On every Python function call, the interpreter sets up a frame that holds the local variables, the evaluation stack, the current position in the bytecode and a reference to the calling frame. That is what shows up in a stack trace. Setting up a frame is not free: its fields have to be initialized on entry and torn down on return, and the function body still has to be interpreted as bytecode. A function written in C (like operator.add or itemgetter) needs no Python frame: it runs as native code instead of going through the bytecode interpreter. That saving, multiplied by millions of iterations, is what makes the gain from operator measurable.
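To make the notion of a frame concrete, here is a minimal sketch that inspects the current frame and its caller, then checks that operator.add is a built-in function rather than a Python function. The helper names who_called_me and caller are invented for the illustration, and the behavior shown assumes CPython.

```python
import inspect
import operator
import types


def who_called_me():
    # Every Python-level call has its own frame; f_back links to the caller's frame.
    frame = inspect.currentframe()
    return frame.f_code.co_name, frame.f_back.f_code.co_name


def caller():
    return who_called_me()


names = caller()  # ('who_called_me', 'caller')

# In CPython, operator.add is a built-in (C) function, not a Python function,
# so calling it does not go through a Python frame of its own.
add_is_builtin = isinstance(operator.add, types.BuiltinFunctionType)
lambda_is_builtin = isinstance(lambda a, b: a + b, types.BuiltinFunctionType)
```

The chain of f_back references is exactly what a traceback walks when it prints a stack trace.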

itemgetter: access by key or index

operator.itemgetter(key) returns a callable that, applied to an object, returns obj[key]. It is the functional equivalent of lambda obj: obj[key], only faster.

from operator import itemgetter

data = [(1, "a"), (3, "c"), (2, "b")]

# With lambda
sorted(data, key=lambda t: t[0])

# With itemgetter
sorted(data, key=itemgetter(0))

On such a small dataset the difference is invisible. On a list of 100,000 tuples, itemgetter is typically 20% to 40% faster than the equivalent lambda, because the extraction happens in C without going back through the Python interpreter on every call.

itemgetter accepts multiple keys and then returns a tuple:

from operator import itemgetter

users = [
    {"name": "Alice", "age": 30, "city": "Paris"},
    {"name": "Bob", "age": 25, "city": "Lyon"},
    {"name": "Alice", "age": 28, "city": "Lyon"},
]

# Multi-criteria sort: by name then by age
sorted_users = sorted(users, key=itemgetter("name", "age"))

The lambda equivalent would be lambda u: (u["name"], u["age"]). Readable, but more verbose and slower.
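One detail worth keeping in mind is that the return type depends on the number of keys: a single key yields the bare value, several keys yield a tuple. The short sketch below illustrates this, along with the fact that itemgetter works on anything implementing __getitem__, including positional indexes and slices. The sample values are made up for the example.

```python
from operator import itemgetter

user = {"name": "Alice", "age": 30, "city": "Paris"}

one = itemgetter("name")(user)          # a bare value: 'Alice'
two = itemgetter("name", "age")(user)   # a tuple: ('Alice', 30)

# itemgetter relies on __getitem__, so it also accepts
# positional indexes and slice objects on sequences.
row = ("Alice", 30, "Paris")
first_two = itemgetter(slice(0, 2))(row)  # ('Alice', 30)
last = itemgetter(-1)(row)                # 'Paris'
```

Code that builds the getter from a variable number of keys with itemgetter(*keys) has to account for the single-key case, since it does not return a one-element tuple.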

attrgetter: access to attributes

attrgetter does for attributes what itemgetter does for keys. It also supports nested attributes via dotted notation.

from operator import attrgetter
from dataclasses import dataclass

@dataclass
class Address:
    city: str
    postal_code: str

@dataclass
class User:
    name: str
    address: Address

users = [
    User("Alice", Address("Paris", "75001")),
    User("Bob", Address("Lyon", "69001")),
]

# Sort by city (nested attribute)
sorted(users, key=attrgetter("address.city"))

# Parallel extraction of several attributes
get_name_city = attrgetter("name", "address.city")
get_name_city(users[0])  # ('Alice', 'Paris')

The benefit over lambda u: u.address.city is not just theoretical. On a large collection, C-level attribute access saves Python frame allocations. And the signature is self-documenting: attrgetter("address.city") says exactly what it extracts.
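Conceptually, a dotted path is resolved one attribute at a time. The sketch below is only a rough pure-Python picture of that behavior, not the actual C implementation; resolve_attr is a hypothetical helper, and SimpleNamespace stands in for the dataclasses above.

```python
from functools import reduce
from operator import attrgetter
from types import SimpleNamespace


def resolve_attr(obj, dotted_path):
    """Follow a dotted path one attribute at a time (illustrative helper)."""
    return reduce(getattr, dotted_path.split("."), obj)


user = SimpleNamespace(name="Alice", address=SimpleNamespace(city="Paris"))

via_helper = resolve_attr(user, "address.city")    # 'Paris'
via_attrgetter = attrgetter("address.city")(user)  # 'Paris'

# A missing attribute anywhere along the path raises AttributeError.
try:
    attrgetter("address.country")(user)
    missing_raises = False
except AttributeError:
    missing_raises = True
```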

methodcaller: calling a method with fixed arguments

methodcaller(name, *args, **kwargs) returns a callable that calls the method name on its argument, passing it args and kwargs.

from operator import methodcaller

phrases = ["hello", "world", "python"]
uppercased = list(map(methodcaller("upper"), phrases))
# ['HELLO', 'WORLD', 'PYTHON']

# With arguments
csv = ["a,b,c", "d,e,f"]
splits = list(map(methodcaller("split", ","), csv))
# [['a', 'b', 'c'], ['d', 'e', 'f']]

This is particularly useful when building a transformation pipeline without writing a lambda per step. The equivalent lambda s: s.split(",") works, but methodcaller("split", ",") makes the intent explicit and can be slightly faster, depending on the CPython version.
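As an illustration of such a pipeline, the following sketch chains three string methods to turn titles into slugs. The run_pipeline helper and the sample titles are assumptions made for the example; they are not part of the operator module.

```python
from operator import methodcaller

# Each step is a callable that takes a string and returns a string.
steps = [
    methodcaller("strip"),
    methodcaller("lower"),
    methodcaller("replace", " ", "-"),
]


def run_pipeline(value, steps):
    """Apply each step to the result of the previous one (hypothetical helper)."""
    for step in steps:
        value = step(value)
    return value


titles = ["  Hello World ", "Python Operator Module"]
slugs = [run_pipeline(title, steps) for title in titles]
# ['hello-world', 'python-operator-module']
```

Because the steps are plain data, the list can be reordered, extended or built from configuration without touching run_pipeline.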

Arithmetic and comparison operators

operator exposes most of the language’s operators as functions (the short-circuiting and and or have no equivalent). A selection:

Operation	Function
`a + b`	`operator.add(a, b)`
`a - b`	`operator.sub(a, b)`
`a * b`	`operator.mul(a, b)`
`a / b`	`operator.truediv(a, b)`
`a // b`	`operator.floordiv(a, b)`
`a % b`	`operator.mod(a, b)`
`a ** b`	`operator.pow(a, b)`
`a < b`	`operator.lt(a, b)`
`a <= b`	`operator.le(a, b)`
`a == b`	`operator.eq(a, b)`
`a > b`	`operator.gt(a, b)`
`a is b`	`operator.is_(a, b)`
`not a`	`operator.not_(a)`
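One common way to exploit this correspondence is a lookup table from symbols to functions, for instance in a small expression evaluator. The sketch below is a minimal version of the idea; the OPS table and the apply_op helper are hypothetical names chosen for the example.

```python
import operator

# Map operator symbols to the corresponding functions.
OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
    "<": operator.lt,
    "==": operator.eq,
}


def apply_op(symbol, a, b):
    """Look up the function for a symbol and call it (hypothetical helper)."""
    return OPS[symbol](a, b)


total = apply_op("+", 2, 3)      # 5
ratio = apply_op("/", 7, 2)      # 3.5
is_less = apply_op("<", 2, 3)    # True
```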

The arithmetic and bitwise functions also exist in an “in-place” form for augmented assignment (iadd, imul, etc.), where iadd(a, b) corresponds to a += b. Comparisons, is_ and not_ have no in-place form. On immutable types like int or tuple, iadd behaves like add because these types cannot be modified in place. On a list, iadd mutates the object and returns it, which is the standard += behavior in Python. For the mechanism details, see Python add and iadd: the difference that changes everything.
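The difference between the mutable and immutable cases is easy to observe with a second reference to the same object. The following sketch uses made-up values and only relies on operator.iadd.

```python
from operator import iadd

# Mutable: the list is modified in place and the same object is returned.
items = [1, 2]
alias = items
result = iadd(items, [3])
# result is items, and alias now sees [1, 2, 3]

# Immutable: a new tuple is returned and the original is untouched.
point = (1, 2)
new_point = iadd(point, (3,))
# point is still (1, 2); new_point is (1, 2, 3)
```

Since iadd cannot rebind the caller's variable, the return value always has to be used; ignoring it silently loses the result for immutable operands.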

Advanced use cases

Stable multi-pass sort. A single key cannot express criteria that go in different directions (one ascending, another descending). Because sorted is stable, you can instead chain the sorts, from the least important criterion to the most important one:

from operator import itemgetter

# Sort by age descending, then by name ascending (on age ties)
data = [
    {"name": "Alice", "age": 30},
    {"name": "Bob", "age": 30},
    {"name": "Charlie", "age": 25},
]

# First pass: by name (secondary criterion)
data = sorted(data, key=itemgetter("name"))
# Second pass: by age descending (primary criterion)
data = sorted(data, key=itemgetter("age"), reverse=True)
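To check that the two passes really produce the intended order, the sketch below starts from deliberately shuffled records and compares the result with a single-pass sort. The single-pass variant negates the age, which is only possible because age is numeric; the sample records are made up for the example.

```python
from operator import itemgetter

people = [
    {"name": "Bob", "age": 30},
    {"name": "Charlie", "age": 25},
    {"name": "Alice", "age": 30},
]

# Two stable passes: secondary key first, primary key last.
by_name = sorted(people, key=itemgetter("name"))
two_pass = sorted(by_name, key=itemgetter("age"), reverse=True)

# Single pass: works here only because age can be negated.
one_pass = sorted(people, key=lambda p: (-p["age"], p["name"]))

order = [p["name"] for p in two_pass]  # ['Alice', 'Bob', 'Charlie']
```

When the descending criterion is a string or another type that cannot be negated, the multi-pass approach remains the simple option.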

Group-by with itertools.groupby. groupby requires a key callable, and itemgetter is the natural tool:

from itertools import groupby
from operator import itemgetter

sales = [
    {"region": "EU", "amount": 100},
    {"region": "EU", "amount": 200},
    {"region": "US", "amount": 150},
]

# groupby requires the data to be sorted by the grouping key
sales_sorted = sorted(sales, key=itemgetter("region"))
for region, items in groupby(sales_sorted, key=itemgetter("region")):
    total = sum(v["amount"] for v in items)
    print(region, total)
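The sorting step is not optional, because groupby only merges consecutive items that share a key. The sketch below shows what happens on unsorted input and then builds the totals correctly; the sample data is a reordered version of the sales list, used purely for illustration.

```python
from itertools import groupby
from operator import itemgetter

sales = [
    {"region": "EU", "amount": 100},
    {"region": "US", "amount": 150},
    {"region": "EU", "amount": 200},
]
by_region = itemgetter("region")

# Without sorting, groupby only merges consecutive items: EU appears twice.
unsorted_keys = [key for key, _ in groupby(sales, key=by_region)]
# ['EU', 'US', 'EU']

# With sorting, each region forms exactly one group.
totals = {
    region: sum(map(itemgetter("amount"), items))
    for region, items in groupby(sorted(sales, key=by_region), key=by_region)
}
# {'EU': 300, 'US': 150}
```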

Reduction with operators. Many classic aggregations can be expressed with reduce and an operator:

from functools import reduce
from operator import add, and_, or_

# List concatenation (avoid on large volumes)
reduce(add, [[1, 2], [3, 4], [5, 6]])  # [1, 2, 3, 4, 5, 6]

# Set intersection
reduce(and_, [{1, 2, 3}, {2, 3, 4}, {3, 4, 5}])  # {3}

# Set union
reduce(or_, [{1, 2}, {2, 3}, {3, 4}])  # {1, 2, 3, 4}

For numeric sum, sum() is still more idiomatic and faster than reduce(add, ...). For product, math.prod() has existed since Python 3.8.
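A short comparison of the variants, including the empty-input case that reduce handles less gracefully than sum or math.prod. Without an initial value, reduce raises TypeError on an empty iterable. The variable names are chosen for the example.

```python
import math
from functools import reduce
from operator import add, mul

numbers = [1, 2, 3, 4]

same_sum = sum(numbers) == reduce(add, numbers)            # both give 10
same_product = math.prod(numbers) == reduce(mul, numbers)  # both give 24

# On an empty iterable, reduce needs an explicit initial value.
empty_total = reduce(add, [], 0)    # 0
empty_product = reduce(mul, [], 1)  # 1

try:
    reduce(add, [])
    empty_raises = False
except TypeError:
    empty_raises = True
```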

operator vs lambda: the measurement

The performance gain comes from the C implementation. A small benchmark sorting 100,000 dictionaries:

import timeit
from operator import itemgetter

data = [{"k": i, "v": i * 2} for i in range(100_000)]

t_lambda = timeit.timeit(
    lambda: sorted(data, key=lambda d: d["k"]),
    number=100,
)

t_getter = timeit.timeit(
    lambda: sorted(data, key=itemgetter("k")),
    number=100,
)

print(f"lambda:     {t_lambda:.2f}s")
print(f"itemgetter: {t_getter:.2f}s")

On CPython 3.12, a ratio of 1.2x to 1.5x in favor of itemgetter is typical depending on the machine. Not spectacular, but free, and the code is shorter. On critical operations inside tight loops (map, filter over millions of items), the gain becomes visible.
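For the tight-loop case, a similar measurement can be sketched with map instead of sorted, so that the key function dominates the cost. This is only an outline under assumed parameters (100,000 tuples, a handful of repeats); the numbers depend heavily on the machine and the Python version, so no expected timings are given.

```python
import timeit
from operator import itemgetter

rows = [(i, i * 2) for i in range(100_000)]
get_second = itemgetter(1)


def with_lambda():
    return list(map(lambda r: r[1], rows))


def with_getter():
    return list(map(get_second, rows))


# Take the minimum over several repeats to reduce noise from the machine.
t_lambda = min(timeit.repeat(with_lambda, number=5, repeat=3))
t_getter = min(timeit.repeat(with_getter, number=5, repeat=3))

print(f"lambda:     {t_lambda:.4f}s")
print(f"itemgetter: {t_getter:.4f}s")
```

Taking the minimum rather than the mean is the usual recommendation with timeit, since slower runs mostly reflect interference from other processes.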

The other dimension is picklability. itemgetter("k") can be pickled with the standard pickle module, while a lambda cannot. This matters when passing a key function to a process pool through multiprocessing:

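from multiprocessing import Pool
from operator import itemgetter

data = [{"k": i, "v": i * 2} for i in range(100_000)]

# The guard is required on platforms that spawn worker processes
if __name__ == "__main__":
    with Pool(4) as pool:
        # Works: itemgetter("k") can be pickled and sent to the workers
        result = pool.map(itemgetter("k"), data)

        # Fails: the lambda cannot be pickled
        # result = pool.map(lambda d: d["k"], data)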
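The same behavior can be observed without a process pool, with a direct pickle round trip. The sketch below assumes the standard pickle module; the exact exception raised for the lambda can vary with the Python version and with where the lambda is defined, so both likely types are caught.

```python
import pickle
from operator import itemgetter

record = {"k": 42, "v": 84}

# itemgetter survives a pickle round trip and still works afterwards.
restored = pickle.loads(pickle.dumps(itemgetter("k")))
value = restored(record)  # 42

# A lambda does not. Depending on the Python version and on where the lambda
# is defined, the failure is a PicklingError or an AttributeError.
try:
    pickle.dumps(lambda d: d["k"])
    lambda_pickled = True
except (pickle.PicklingError, AttributeError):
    lambda_pickled = False
```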

When not to use operator

operator is no silver bullet. When the extraction logic is non-trivial (a computation, a condition, a fallback value), a named function is more readable:

# Bad usage: unreadable
sorted(data, key=lambda x: itemgetter("score")(x) if x["active"] else 0)

# Better: named function
def sort_key(x):
    return x["score"] if x["active"] else 0

sorted(data, key=sort_key)

operator shines on simple, repetitive cases. Beyond that, a dedicated function keeps the code clear.

What to take away

The operator module does not radically change a program’s performance, but it removes an entire category of trivial lambdas that clutter the code. itemgetter, attrgetter and methodcaller are the three tools to reach for first: they make sort keys and functional pipelines self-documenting, often slightly faster, and picklable. The arithmetic and comparison operators round out the palette for reduction cases. Understanding operator means learning to express operations as first-class values, one of Python’s underused strengths.
