When you have a list of identifiers and want to retrieve the corresponding instances, the usual reflex in Django is filter(pk__in=[...]). It works — one SQL query. But in_bulk() is an often-overlooked ORM optimization: it returns a dictionary {id: instance} instead of a QuerySet, which fundamentally changes how you access results. Where filter() forces an O(n) traversal to find an object by ID, in_bulk() gives direct O(1) access.

in_bulk() signature and behavior

QuerySet.in_bulk(id_list=None, *, field_name='pk')
  • id_list: list of identifiers to retrieve. If omitted (called without arguments), returns all objects in the table.
  • field_name: field used as the dictionary key. Must have unique=True, otherwise Django raises a ValueError.

The generated SQL is a simple WHERE pk IN (...) clause — one query regardless of list size.
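
As a minimal sketch of what this means in practice (the Contract model is the one used throughout this article; the non-unique status field in the last comment is hypothetical):

# Roughly what in_bulk([1, 2, 3]) does: one IN query, keyed by pk
ids: list[int] = [1, 2, 3]
by_pk: dict[int, Contract] = {c.pk: c for c in Contract.objects.filter(pk__in=ids)}

# Same result in a single call
by_pk = Contract.objects.in_bulk(ids)

# A non-unique field_name is rejected up front:
# Contract.objects.in_bulk(['draft'], field_name='status')  # → ValueError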

in_bulk() vs filter(): O(1) access instead of O(n)

# filter() → QuerySet, O(n) access
contrats: list[Contract] = list(Contract.objects.filter(pk__in=[1, 2, 3]))
contrat: Contract | None = next((c for c in contrats if c.pk == 2), None)

# in_bulk() → dict, O(1) access
contrats_map: dict[int, Contract] = Contract.objects.in_bulk([1, 2, 3])
# → {1: <Contract pk=1>, 2: <Contract pk=2>, 3: <Contract pk=3>}

contrat = contrats_map.get(2)  # direct access, None if absent

IDs not found in the database simply don’t appear in the returned dictionary. No error, no None value: missing key = object doesn’t exist.
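
Concretely, assuming only pk 1 and pk 2 exist in the table:

contrats_map = Contract.objects.in_bulk([1, 2, 999])
# → {1: <Contract pk=1>, 2: <Contract pk=2>}  (999 is simply absent)

missing_ids = [pk for pk in [1, 2, 999] if pk not in contrats_map]
# → [999], handy for reporting what wasn't found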

in_bulk() with field_name: index by any unique field

in_bulk() accepts any unique=True field via field_name:

# By unique reference
refs: list[str] = ['REF-001', 'REF-002', 'REF-003']
contrats_map: dict[str, Contract] = Contract.objects.in_bulk(
    refs,
    field_name='reference'
)
# → {'REF-001': <Contract ...>, 'REF-002': <Contract ...>, ...}

contrat: Contract | None = contrats_map.get('REF-002')

Particularly useful during data synchronizations where the business identifier isn’t the PK.

Django use cases: when in_bulk() makes the difference

Hydrating multiple aggregates in one query

In a DDD context, loading multiple aggregates from a list of IDs:

ids: list[int] = [event.contract_id for event in events]
contrats_map: dict[int, Contract] = Contract.objects.in_bulk(ids)

for event in events:
    contrat: Contract | None = contrats_map.get(event.contract_id)
    if contrat:
        contrat.apply(event)

One query for all contracts, then direct ID lookup in the loop.

Avoiding N+1 during imports

from decimal import Decimal

def import_rows(csv_rows: list[dict[str, str]]) -> None:
    references: list[str] = [row['ref'] for row in csv_rows]
    existing: dict[str, Product] = Product.objects.in_bulk(
        references, field_name='reference'
    )

    to_create: list[Product] = []
    to_update: list[Product] = []

    for row in csv_rows:
        if row['ref'] in existing:
            product = existing[row['ref']]
            product.price = Decimal(row['price'])
            to_update.append(product)
        else:
            to_create.append(Product(reference=row['ref'], price=Decimal(row['price'])))

    Product.objects.bulk_create(to_create)
    Product.objects.bulk_update(to_update, ['price'])

Classic import/sync pattern: one in_bulk() query, then bulk_create + bulk_update. Zero N+1.

Fetching all objects from a table

# Loads the entire table into memory — reserve for small tables
config: dict[int, AppSetting] = AppSetting.objects.in_bulk()
value: str = config[42].value

Handy for reference tables (countries, currencies, settings) queried frequently.
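
If such a table is read on every request, a process-level cache on top of in_bulk() keeps it to one query per process. A minimal sketch, assuming the AppSetting model above (the settings_map helper is hypothetical, and the cache must be cleared whenever the table changes):

from functools import lru_cache


@lru_cache(maxsize=1)
def settings_map() -> dict[int, AppSetting]:
    # One query per process; later calls hit the cache
    return AppSetting.objects.in_bulk()


value: str = settings_map()[42].value
# settings_map.cache_clear()  # call after AppSetting rows change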

Optimizing in_bulk() on large lists with chunking

For lists of thousands of IDs, the IN (...) clause can get heavy on the database side, and some backends cap the number of bound parameters or IN items (SQLite and Oracle in particular). The solution: split the list into batches.

from collections.abc import Iterator
from itertools import islice
from typing import Any

from django.db.models import QuerySet


def chunked(iterable: list[Any], size: int) -> Iterator[list[Any]]:
    it = iter(iterable)
    while chunk := list(islice(it, size)):
        yield chunk


def in_bulk_chunked(
    queryset: QuerySet,
    ids: list[Any],
    chunk_size: int = 500,
    field_name: str = 'pk',
) -> dict[Any, Any]:
    result: dict[Any, Any] = {}
    for chunk in chunked(ids, chunk_size):
        result.update(queryset.in_bulk(chunk, field_name=field_name))
    return result


# Usage
contracts: dict[int, Contract] = in_bulk_chunked(
    Contract.objects, list_of_5000_ids
)

Summary: in_bulk() vs filter() in Django

                 filter(pk__in=[...])      in_bulk([...])
Return           QuerySet                  dict {id: instance}
Access by ID     O(n) (traversal)          O(1) (direct key)
SQL queries      1                         1
Missing IDs      silently ignored          key absent from dict
field_name       no                        yes (unique=True required)

in_bulk() isn’t a universal replacement for filter(). It’s a specific tool: when you have IDs and want direct key-based access, it’s the right choice. For everything else, filter() remains perfectly suited.
