The previous article on the Transactional Outbox set a clear guarantee: every event written to the database will eventually be published. That guarantee is intentionally at-least-once. A consumer may receive the same event two times, three times, or more if the network behaves badly. The Outbox pattern never promises uniqueness.
The consequence follows immediately: if the consumer applies the effect of the message twice, it bills twice, sends two emails, decrements stock twice. The consistency guaranteed on the producer side collapses on the consumer side.
The Inbox pattern fixes that. It guarantees that an event received several times is processed only once, leveraging the same primitive as the Outbox: a local SQL transaction. This is the third article in the distributed architecture pattern series, after the Saga pattern and the Outbox pattern.
The problem: at-least-once is not exactly-once
Back to the e-commerce example. A Billing service consumes a Kafka topic orders.confirmed. For every message, it generates a PDF invoice and emails it to the customer.
def handle_order_confirmed(event: dict) -> None:
    invoice = Invoice.objects.create(order_id=event["id"])
    pdf = render_pdf(invoice)
    email.send(invoice.customer_email, pdf)
Four reasons make a message arrive twice:
- The Outbox relay on the producer side retries after a crash and republishes an already-published event.
- The consumer worker processes the message and commits to the database, but crashes before committing its offset to Kafka. On restart, Kafka delivers the message again.
- A Kafka consumer-group rebalance reassigns a partition to another worker, which resumes from the last committed offset and replays whatever came after it.
- An operator replays a topic from a previous offset for debugging.
Without protection, the invoice is created twice, the PDF generated twice, the email sent twice. The customer calls support. The exactly-once promise, which no distributed system actually keeps, turns into a visible at-least-twice.
The principle: tracking what has already been processed
The Inbox pattern relies on an idea symmetric to the Outbox: before processing a message, record its identifier in an inbox table. If the insert succeeds, process. If it fails because the identifier already exists, skip the message. Everything happens inside a single atomic SQL transaction, which rules out the classic race between “check” and “write”.
The essential constraint: every event published by the Outbox must carry a stable, unique identifier. Not a timestamp, not a reconstructible concatenation, but a UUID generated at write time on the producer side and transported as-is in the Kafka payload.
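As an illustration of that constraint, here is a minimal sketch of an event envelope built on the producer side. The build_event helper and the field names other than event_id are assumptions made for the example, not part of the original design. The point is only that the identifier is generated once, at write time, and survives serialization unchanged.

```python
import json
import uuid
from datetime import datetime, timezone


def build_event(order_id: int) -> dict:
    """Build the envelope once, when the outbox row is written (hypothetical helper)."""
    return {
        "event_id": str(uuid.uuid4()),  # generated once, never recomputed
        "type": "orders.confirmed",
        "order_id": order_id,
        "occurred_at": datetime.now(timezone.utc).isoformat(),
    }


event = build_event(order_id=42)
payload = json.dumps(event)     # what the relay publishes, possibly several times
received = json.loads(payload)  # what the consumer decodes on each delivery
```

However many times the relay republishes payload, every delivery carries the same event_id, which is what the consumer will deduplicate on.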
The inbox table
A minimal schema on the consumer side:
from django.db import models


class InboxEvent(models.Model):
    event_id = models.UUIDField(primary_key=True)
    consumer = models.CharField(max_length=64)
    processed_at = models.DateTimeField(auto_now_add=True)

    class Meta:
        indexes = [
            models.Index(fields=["consumer", "processed_at"]),
        ]
Two details matter.
event_id is the primary key. The uniqueness constraint is the guarantee of idempotency: two inserts of the same event_id raise an IntegrityError at the database level. That is what makes the pattern bullet-proof, regardless of application-side concurrency.
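The consumer column records which service processed the event. As declared above, though, event_id alone is the primary key, so a second service inserting the same event_id into a shared table would be rejected as a duplicate. For the Billing service and the Analytics service to process the same event independently, each with its own inbox row, either give each service its own inbox table or make the uniqueness constraint cover the pair (consumer, event_id).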
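To make the shared-table case concrete, here is a minimal sketch using Python's built-in sqlite3 module rather than the Django model. It assumes the uniqueness key is the pair (consumer, event_id), which differs from the model shown earlier, where event_id alone is the primary key. The table layout and the record helper are illustrative names.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE inbox_event (
        consumer     TEXT NOT NULL,
        event_id     TEXT NOT NULL,
        processed_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
        PRIMARY KEY (consumer, event_id)
    )
    """
)


def record(consumer: str, event_id: str) -> bool:
    """Return True if this consumer sees the event for the first time."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO inbox_event (consumer, event_id) VALUES (?, ?)",
                (consumer, event_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False


event_id = str(uuid.uuid4())
results = [
    record("billing", event_id),    # first delivery to Billing
    record("analytics", event_id),  # same event, other consumer
    record("billing", event_id),    # redelivery to Billing
]
```

With the composite key, the first two calls succeed and only the redelivery to the same consumer is rejected.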
Atomic processing
The message handler becomes:
from django.db import IntegrityError, transaction


def handle_order_confirmed(event: dict) -> None:
    try:
        with transaction.atomic():
            InboxEvent.objects.create(
                event_id=event["event_id"],
                consumer="billing",
            )
            invoice = Invoice.objects.create(order_id=event["order_id"])
    except IntegrityError:
        return
    pdf = render_pdf(invoice)
    email.send(invoice.customer_email, pdf)
Three choices matter here.
The insert into InboxEvent and the creation of the Invoice share the same transaction.atomic. Either both commit, or neither does. There is no way to end up with an inbox trace and no invoice, or the opposite.
IntegrityError catches the primary-key collision. It is the equivalent of INSERT ... ON CONFLICT DO NOTHING on PostgreSQL, but portable. No prior SELECT, no race condition: two workers trying simultaneously will have a single winner, decided by the database.
Sending the PDF and email happens outside the transaction. This is deliberate: you do not want to keep the transaction, and the locks it holds, open during an SMTP call that takes several seconds. The trade-off: if the worker crashes between the commit and the email, the email will never go out. It is the classic compromise between at-least-once persistence and at-most-once external side effect, to be settled per business case.
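The second choice can also be expressed without an exception. The sketch below uses Python's built-in sqlite3 module, whose SQL dialect accepts ON CONFLICT DO NOTHING, to show the same "single winner decided by the database" behavior. The table layout and the handle function are simplified stand-ins for the Django code, not a drop-in replacement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE inbox_event (event_id TEXT PRIMARY KEY, consumer TEXT NOT NULL);
    CREATE TABLE invoice (id INTEGER PRIMARY KEY, order_id INTEGER NOT NULL);
    """
)


def handle(event: dict) -> bool:
    """Return True if the event was applied, False if it was a duplicate."""
    with conn:  # one transaction: inbox row and invoice commit together
        cur = conn.execute(
            "INSERT INTO inbox_event (event_id, consumer) VALUES (?, ?) "
            "ON CONFLICT DO NOTHING",
            (event["event_id"], "billing"),
        )
        if cur.rowcount == 0:  # identifier already recorded: skip
            return False
        conn.execute(
            "INSERT INTO invoice (order_id) VALUES (?)", (event["order_id"],)
        )
        return True


event = {"event_id": "7f1c0c1e-0000-4000-8000-000000000001", "order_id": 42}
outcomes = [handle(event), handle(event), handle(event)]
invoice_count = conn.execute("SELECT COUNT(*) FROM invoice").fetchone()[0]
```

Three deliveries of the same event leave a single invoice. Compared with catching IntegrityError, this form cannot mistake an unrelated constraint violation on the invoice for a duplicate delivery, at the cost of database-specific SQL.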
The non-transactional side-effect pitfall
The code above carries an acknowledged flaw: if the worker dies after the commit but before email.send, the invoice and the inbox entry both exist in the database, but no email was sent. When Kafka redelivers the message, the inbox blocks reprocessing, so the handler returns before reaching email.send and the email never goes out.
Two strategies exist.
Push the side effect into the consumer’s own Outbox. Instead of sending the email directly, write an InvoiceToSend event into a local outbox table. Another worker, downstream, handles the sending with its own at-least-once guarantees. The pattern becomes Inbox → process → Outbox, and each transition is transactional.
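A minimal sketch of that Inbox → process → Outbox chain, again with sqlite3 standing in for the real database. The outbox_event table, its columns, and the payload shape are assumptions for the example; only the InvoiceToSend name comes from the text.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE inbox_event (event_id TEXT PRIMARY KEY);
    CREATE TABLE invoice (id INTEGER PRIMARY KEY, order_id INTEGER NOT NULL);
    CREATE TABLE outbox_event (
        id INTEGER PRIMARY KEY,
        type TEXT NOT NULL,
        payload TEXT NOT NULL,
        published_at TEXT
    );
    """
)


def handle_order_confirmed(event: dict) -> None:
    try:
        with conn:  # inbox row, invoice and outbox row commit or roll back together
            conn.execute(
                "INSERT INTO inbox_event (event_id) VALUES (?)", (event["event_id"],)
            )
            cur = conn.execute(
                "INSERT INTO invoice (order_id) VALUES (?)", (event["order_id"],)
            )
            conn.execute(
                "INSERT INTO outbox_event (type, payload) VALUES (?, ?)",
                ("InvoiceToSend", json.dumps({"invoice_id": cur.lastrowid})),
            )
    except sqlite3.IntegrityError:
        return  # duplicate delivery: nothing is written


event = {"event_id": "evt-1", "order_id": 42}
handle_order_confirmed(event)
handle_order_confirmed(event)  # redelivery
pending = conn.execute(
    "SELECT type, payload FROM outbox_event WHERE published_at IS NULL"
).fetchall()
```

The handler no longer talks to SMTP at all. A separate worker reads the pending rows and sends the email, and a crash at any point leaves either nothing or all three rows.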
Accept the risk and compensate via an external mechanism. A cron that detects Invoice records with no email sent after X minutes and triggers a resend. Simpler, sufficient in many cases, but requires business visibility on “what is an invoice waiting for an email”.
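The detection query behind such a cron could look like the sketch below. It assumes the invoice table carries an email_sent_at column that stays NULL until the email goes out. That column, the find_stuck_invoices helper, and the ten-minute threshold are all hypothetical.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE invoice (
        id INTEGER PRIMARY KEY,
        created_at TEXT NOT NULL,
        email_sent_at TEXT  -- hypothetical column: NULL until the email goes out
    )
    """
)


def find_stuck_invoices(now: datetime, max_delay: timedelta) -> list[int]:
    """Return ids of invoices still waiting for their email after max_delay."""
    cutoff = (now - max_delay).isoformat()
    rows = conn.execute(
        "SELECT id FROM invoice "
        "WHERE email_sent_at IS NULL AND created_at < ? ORDER BY id",
        (cutoff,),
    )
    return [row[0] for row in rows]


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
with conn:
    conn.executemany(
        "INSERT INTO invoice (id, created_at, email_sent_at) VALUES (?, ?, ?)",
        [
            (1, (now - timedelta(minutes=30)).isoformat(), None),  # stuck
            (2, (now - timedelta(minutes=30)).isoformat(), now.isoformat()),  # sent
            (3, (now - timedelta(minutes=2)).isoformat(), None),  # too recent
        ],
    )

stuck = find_stuck_invoices(now, max_delay=timedelta(minutes=10))
```

Only invoice 1 is reported. The resend itself remains at-least-once: a crash between sending and setting email_sent_at would produce a second email on the next run.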
The first is more rigorous, the second more pragmatic. Choose by the criticality of the effect and the tolerance for delay.
Purging the inbox
Like the Outbox, the Inbox grows forever if left unmaintained. At 10,000 events per day, one year adds up to about 3.65 million rows. Primary-key lookups stay fast, but the table keeps consuming disk space.
Two approaches.
Short TTL with business-driven retention. Keep processed event_id values for the maximum possible replay window on the broker side (typically the Kafka retention, 7 or 14 days), then delete. Beyond that, replaying a very old event cannot happen anyway. A weekly purge command is enough.
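A purge along those lines can be sketched as follows, with sqlite3 standing in for the production database. The 14-day RETENTION value is an assumption that should match the broker's actual replay window, and purge_inbox is an illustrative name.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=14)  # assumption: matches the broker's replay window

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE inbox_event (event_id TEXT PRIMARY KEY, processed_at TEXT NOT NULL)"
)


def purge_inbox(now: datetime) -> int:
    """Delete inbox rows older than RETENTION and return how many were removed."""
    cutoff = (now - RETENTION).isoformat()
    with conn:
        cur = conn.execute(
            "DELETE FROM inbox_event WHERE processed_at < ?", (cutoff,)
        )
    return cur.rowcount


now = datetime(2024, 6, 1, tzinfo=timezone.utc)
with conn:
    conn.executemany(
        "INSERT INTO inbox_event VALUES (?, ?)",
        [
            ("old", (now - timedelta(days=30)).isoformat()),
            ("recent", (now - timedelta(days=3)).isoformat()),
        ],
    )

deleted = purge_inbox(now)
remaining = [row[0] for row in conn.execute("SELECT event_id FROM inbox_event")]
```

On a large table, the same DELETE would probably be run in bounded batches so that no single statement holds locks for long.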
Date-based partitioning. For really high volumes, partition InboxEvent by month using PostgreSQL’s native features. Dropping a full partition is instant, whereas a massive DELETE can block.
Outbox + Inbox: closing the loop
With both patterns combined, the trajectory of an event becomes:
- Service A writes the event to its outbox within the business transaction.
- The relay publishes to Kafka, with an at-least-once guarantee.
- Service B receives the message and records it in its inbox within the transaction that applies the business effect.
- If the message arrives a second time, the IntegrityError on the inbox blocks reprocessing.
The result is not exactly-once in the strict sense. It is effectively-once: the system behaves as if every event had only been processed once, even though technically it may have been received several times. That is the guarantee actually reachable in a distributed system.
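The whole trajectory can be simulated in a few lines of plain Python. This is only a toy model: lists and a set stand in for the outbox table, the broker, and the inbox table, and the check-then-add in consume is not concurrency-safe the way a database uniqueness constraint is. All names are illustrative.

```python
import uuid

outbox: list[dict] = []   # Service A: rows written with the business transaction
inbox: set[str] = set()   # Service B: identifiers already processed
invoices: list[int] = []  # Service B: the business effect


def confirm_order(order_id: int) -> None:
    # Service A: the event gets its identifier once, at write time.
    outbox.append({"event_id": str(uuid.uuid4()), "order_id": order_id})


def relay(deliveries: int) -> list[dict]:
    # At-least-once relay: every outbox row is published `deliveries` times.
    return [event for event in outbox for _ in range(deliveries)]


def consume(event: dict) -> None:
    # Service B: skip anything whose identifier is already in the inbox.
    if event["event_id"] in inbox:
        return
    inbox.add(event["event_id"])
    invoices.append(event["order_id"])


confirm_order(1)
confirm_order(2)
messages = relay(deliveries=3)
for message in messages:
    consume(message)
```

Six messages are received, but only two invoices exist at the end: received several times, processed once.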
When the inbox is unnecessary
The pattern adds a table, an index, a schema constraint on the payload (the event_id). For a consumer whose operation is naturally idempotent, it is overengineering.
If the handler runs Order.objects.filter(id=...).update(status="confirmed"), replaying it ten times produces the same result as running it once. Deduplication is implicit, the inbox is useless.
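The same idea in a runnable form, with sqlite3 in place of the Django ORM. The orders table is a minimal stand-in for the Order model.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT NOT NULL)")
with conn:
    conn.execute("INSERT INTO orders (id, status) VALUES (42, 'pending')")


def handle_order_confirmed(event: dict) -> None:
    # The new value depends only on the payload: replays converge on the same state.
    with conn:
        conn.execute(
            "UPDATE orders SET status = 'confirmed' WHERE id = ?",
            (event["order_id"],),
        )


for _ in range(10):  # ten deliveries of the same message
    handle_order_confirmed({"order_id": 42})

rows = conn.execute("SELECT id, status FROM orders").fetchall()
```

After ten deliveries the table holds exactly what one delivery would have produced, with no inbox table involved.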
The Inbox becomes relevant once:
- the operation has a non-idempotent business effect (creating a resource, charging, sending, incrementing a counter)
- losing the event is unacceptable, so a best-effort, approximate “skip if already seen” check will not do
- multiple consumers process the same queue and risk stealing each other’s work
Conversely, for a consumer that only updates a deterministic state from the payload, write a handler idempotent by construction.
Conclusion
The Inbox is not a standalone pattern. It is the missing half of the Outbox: without it, the at-least-once guarantee becomes user-visible duplicates. With it, the full producer-broker-consumer chain behaves predictably, despite every possible failure mode in between.
The implementation cost is small: a table, a uniqueness constraint, a transaction.atomic. The benefit is to turn a system where “we just need a message not to arrive twice” into a system where “a message arriving a hundred times changes nothing”. Exactly the kind of guarantee you only appreciate in production, the day Kafka replays an offset by accident.
