When you have a list of identifiers and want to retrieve the corresponding instances, the usual reflex in Django is filter(pk__in=[...]). It works — one SQL query. But in_bulk() is an often-overlooked ORM optimization: it returns a dictionary {id: instance} instead of a QuerySet, which fundamentally changes how you access results. Where filter() forces an O(n) traversal to find an object by ID, in_bulk() gives direct O(1) access.
## in_bulk() signature and behavior
```python
QuerySet.in_bulk(id_list=None, *, field_name='pk')
```

- `id_list`: list of identifiers to retrieve. If omitted (called without arguments), in_bulk() returns all objects in the table.
- `field_name`: field used as the dictionary key. Must have `unique=True`, otherwise Django raises a `ValueError`.
The generated SQL is a simple WHERE pk IN (...) clause — one query regardless of list size.
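Conceptually, in_bulk() is equivalent to running that filter and building a dictionary keyed by primary key. A minimal sketch of the equivalence, using a plain dataclass to stand in for a model instance:

```python
from dataclasses import dataclass


@dataclass
class Row:
    """Stand-in for a model instance; only the pk matters here."""
    pk: int


def in_bulk_equivalent(rows: list[Row], ids: list[int]) -> dict[int, Row]:
    # What Django does conceptually:
    # {obj.pk: obj for obj in Model.objects.filter(pk__in=ids)}
    wanted = set(ids)
    return {row.pk: row for row in rows if row.pk in wanted}


rows = [Row(1), Row(2), Row(5)]
print(in_bulk_equivalent(rows, [1, 2, 3]))  # {1: Row(pk=1), 2: Row(pk=2)}
```

Note that ID 3 yields no entry at all, which is exactly in_bulk()'s contract for missing rows.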
## in_bulk() vs filter(): O(1) access instead of O(n)
```python
# filter() → QuerySet, O(n) access
contrats: list[Contract] = list(Contract.objects.filter(pk__in=[1, 2, 3]))
contrat: Contract | None = next((c for c in contrats if c.pk == 2), None)

# in_bulk() → dict, O(1) access
contrats_map: dict[int, Contract] = Contract.objects.in_bulk([1, 2, 3])
# → {1: <Contract pk=1>, 2: <Contract pk=2>, 3: <Contract pk=3>}
contrat = contrats_map.get(2)  # direct access, None if absent
```
IDs not found in the database simply don’t appear in the returned dictionary. No error, no None value: missing key = object doesn’t exist.
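That contract makes it straightforward to report which IDs were not found: compare the requested IDs against the returned keys. A sketch, with a plain dict standing in for in_bulk()'s return value:

```python
def find_missing(requested_ids: list[int], found: dict[int, object]) -> list[int]:
    # in_bulk() simply omits missing IDs, so anything absent
    # from the dict's keys did not exist in the database
    return [pk for pk in requested_ids if pk not in found]


# Simulated in_bulk() result: ID 999 does not exist
found = {1: "contract-1", 2: "contract-2"}
print(find_missing([1, 2, 999], found))  # [999]
```

Useful for validation steps that must surface unknown identifiers before processing.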
## in_bulk() with field_name: index by any unique field
in_bulk() accepts any unique=True field via field_name:
```python
# By unique reference
refs: list[str] = ['REF-001', 'REF-002', 'REF-003']
contrats_map: dict[str, Contract] = Contract.objects.in_bulk(
    refs,
    field_name='reference'
)
# → {'REF-001': <Contract ...>, 'REF-002': <Contract ...>, ...}
contrat: Contract | None = contrats_map.get('REF-002')
```
Particularly useful during data synchronizations where the business identifier isn’t the PK.
## Django use cases: when in_bulk() makes the difference
### Hydrating multiple aggregates in one query
In a DDD context, loading multiple aggregates from a list of IDs:
```python
ids: list[int] = [event.contract_id for event in events]
contrats_map: dict[int, Contract] = Contract.objects.in_bulk(ids)

for event in events:
    contrat: Contract | None = contrats_map.get(event.contract_id)
    if contrat:
        contrat.apply(event)
```
One query for all contracts, then direct ID lookup in the loop.
### Avoiding N+1 during imports
```python
from decimal import Decimal


def import_rows(csv_rows: list[dict[str, str]]) -> None:
    references: list[str] = [row['ref'] for row in csv_rows]
    existing: dict[str, Product] = Product.objects.in_bulk(
        references, field_name='reference'
    )

    to_create: list[Product] = []
    to_update: list[Product] = []
    for row in csv_rows:
        if row['ref'] in existing:
            product = existing[row['ref']]
            product.price = Decimal(row['price'])
            to_update.append(product)
        else:
            to_create.append(
                Product(reference=row['ref'], price=Decimal(row['price']))
            )

    Product.objects.bulk_create(to_create)
    Product.objects.bulk_update(to_update, ['price'])
```
Classic import/sync pattern: one in_bulk() query, then bulk_create + bulk_update. Zero N+1.
### Fetching all objects from a table
```python
# Loads the entire table into memory — reserve for small tables
config: dict[int, AppSetting] = AppSetting.objects.in_bulk()
value: str = config[42].value
```
Handy for reference tables (countries, currencies, settings) queried frequently.
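For such tables, the in_bulk() dictionary can be loaded once per process and memoized. A sketch using `functools.lru_cache`, where `load_settings` is a hypothetical loader standing in for the actual ORM call:

```python
from functools import lru_cache

calls = 0


def load_settings() -> dict[int, str]:
    # With Django, this body would be: return AppSetting.objects.in_bulk()
    global calls
    calls += 1
    return {1: "EUR", 2: "USD"}


@lru_cache(maxsize=1)
def settings_map() -> dict[int, str]:
    return load_settings()


settings_map()
print(settings_map()[2])  # USD
print(calls)              # 1 — the table was loaded only once
```

If the underlying table changes, `settings_map.cache_clear()` forces a reload on the next call.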
## Optimizing in_bulk() on large lists with chunking
For lists of thousands of IDs, the IN(...) clause can get heavy on the database side. The solution: split into batches.
```python
from collections.abc import Iterator
from itertools import islice
from typing import Any

from django.db.models import QuerySet


def chunked(iterable: list[Any], size: int) -> Iterator[list[Any]]:
    it = iter(iterable)
    while chunk := list(islice(it, size)):
        yield chunk


def in_bulk_chunked(
    queryset: QuerySet,
    ids: list[Any],
    chunk_size: int = 500,
    field_name: str = 'pk',
) -> dict[Any, Any]:
    result: dict[Any, Any] = {}
    for chunk in chunked(ids, chunk_size):
        result.update(queryset.in_bulk(chunk, field_name=field_name))
    return result


# Usage
contracts: dict[int, Contract] = in_bulk_chunked(
    Contract.objects, list_of_5000_ids
)
```
## Summary: in_bulk() vs filter() in Django
| | filter(pk__in=[...]) | in_bulk([...]) |
|---|---|---|
| Return | QuerySet (list) | dict {id: instance} |
| Access by ID | O(n) — traversal | O(1) — direct key |
| SQL queries | 1 | 1 |
| Missing IDs | silently ignored | key absent from dict |
| field_name | no | yes (unique=True required) |
in_bulk() isn’t a universal replacement for filter(). It’s a specific tool: when you have IDs and want direct key-based access, it’s the right choice. For everything else, filter() remains perfectly suited.