The previous two articles tackled idempotency on the event side: the Outbox pattern guarantees a message is published at least once, and the Inbox pattern guarantees it is consumed only once. One last place where the same problem shows up sits further upstream: the HTTP API itself.

When a client sends a POST /api/payments and the connection drops before the response comes back, the client has no way to know whether the payment was created. If it retries, it risks paying twice. If it does not retry, it risks not paying at all. The Idempotency Key pattern, popularized by Stripe and since adopted by many payment APIs, solves that dilemma by putting retry control in the client’s hands.

This is the fourth article in the distributed architecture pattern series, and the API-facing counterpart of the Outbox and Inbox patterns.

The problem: a POST is not replayable

A typical endpoint:

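import stripe
from rest_framework.decorators import api_view
from rest_framework.response import Response

# Payment and PaymentSerializer are the application's own model and serializer.


@api_view(["POST"])
def create_payment(request):
    # Every call inserts a new row and triggers a new charge.
    payment = Payment.objects.create(
        amount=request.data["amount"],
        customer_id=request.user.id,
    )
    stripe.Charge.create(amount=payment.amount, ...)
    return Response(PaymentSerializer(payment).data, status=201)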

Four scenarios make a client send the same request twice:

  1. The request succeeded on the server, but the network dropped before the response arrived. The client retries, convinced nothing happened.
  2. The user double-clicks “Pay”. Two requests go out simultaneously.
  3. A mobile client on an unstable connection auto-retries any request that did not receive a response within 30 seconds.
  4. A load balancer or front-line proxy retries a request it thinks timed out.

In all those cases, two identical requests reach the server. Without protection, two payments are created, two Stripe charges are emitted, the customer is billed twice. The server-side Outbox pattern does not help here: the problem sits upstream, before the event is even written.

The principle: the client supplies the key

The central idea flips responsibility: it is not the server’s job to guess whether a request is a duplicate, it is the client’s job to flag it. Before sending a request, the client generates a UUID and passes it as a header:

POST /api/payments
Idempotency-Key: 7f3a1b9e-2c4d-4e8f-9a1b-3c5d7e9f1a2b
Content-Type: application/json

{"amount": 4990, "currency": "EUR"}

If the request fails (timeout, network error, server outage), the client retries with the same key. The server recognizes the key, sees it has already processed that request, and returns the stored response without re-executing the operation.

If the client crashes before recording the key it used, it will have to generate a new one on the next try, and the request will be treated as new. That is the trade-off: protection against network duplicates, not against full client-side state loss. Some official SDKs (Stripe's, for example) generate the key and reuse it across their automatic retries, but keeping it across a client restart is up to the application.
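On the client side, the rule "same request, same key" can be sketched as follows. This is a minimal illustration, not any SDK's actual implementation: the send callable and the TransientNetworkError exception are hypothetical stand-ins for whatever HTTP client and timeout error the application uses.

```python
import uuid


class TransientNetworkError(Exception):
    """Hypothetical stand-in for a timeout or a dropped connection."""


def post_with_idempotency_key(send, body, max_attempts=3):
    """Call send(headers, body) until it returns, reusing one key throughout.

    `send` is a hypothetical transport callable supplied by the caller; it is
    expected to raise TransientNetworkError when no response was received.
    """
    # The key is generated once, before the first attempt, never inside the loop.
    key = str(uuid.uuid4())
    headers = {"Idempotency-Key": key, "Content-Type": "application/json"}
    last_error = None
    for _ in range(max_attempts):
        try:
            return send(headers, body)
        except TransientNetworkError as exc:
            last_error = exc  # the next attempt carries the same key
    raise last_error
```

A real client would usually add a backoff between attempts and persist the key next to the pending operation, so that a restart can resume with the same key.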

Server-side storage

A minimal schema:

from django.db import models


class IdempotencyKey(models.Model):
    key = models.CharField(max_length=128, primary_key=True)
    user_id = models.BigIntegerField()
    request_hash = models.CharField(max_length=64)
    response_status = models.IntegerField(null=True)
    response_body = models.JSONField(null=True)
    created_at = models.DateTimeField(auto_now_add=True)
    completed_at = models.DateTimeField(null=True)

    class Meta:
        indexes = [
            models.Index(fields=["created_at"]),
        ]

A few structuring choices.

Keys must be scoped to the user, so that two different clients that happen to pick the same UUID stay isolated. With key as the sole primary key, the user_id column alone does not enforce that. In practice, we store the key as f"{user_id}:{key}": the isolation is explicit and a single PK index covers it.

request_hash is a SHA-256 of the request body. It catches a dangerous case: a client reusing a key for a different request. If the hash does not match the stored one, we return 422 Unprocessable Entity instead of the previous response.
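One caveat when hashing the raw body bytes: two JSON payloads that differ only in key order or whitespace produce different hashes, so a client that re-serializes its payload before retrying would get a 422. A possible variant, shown here as an assumption rather than as part of the pattern itself, hashes a canonical form of the JSON. The function name canonical_request_hash is hypothetical.

```python
import hashlib
import json


def canonical_request_hash(raw_body: bytes) -> str:
    """SHA-256 hex digest of a JSON body, independent of key order and spacing."""
    payload = json.loads(raw_body)
    # Sorted keys and fixed separators give one serialization per logical payload.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

The digest is 64 hexadecimal characters, which matches the max_length=64 of the request_hash column. Whether to canonicalize is a design choice: hashing raw bytes is stricter and cheaper, canonical hashing is more forgiving toward clients.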

response_status and response_body store what we returned on the first call. On replay, we return them as is.

completed_at distinguishes two states: NULL = processing in flight, not NULL = completed. That distinction is critical to handle concurrent requests.

The idempotency middleware

The full flow looks like this:

import hashlib
from functools import wraps

from django.db import IntegrityError, transaction
from django.utils import timezone
from rest_framework.response import Response


def idempotency_middleware(view_func):
    @wraps(view_func)
    def wrapper(request, *args, **kwargs):
        key = request.headers.get("Idempotency-Key")
        # No key, or not a POST: fall through to the view unchanged.
        if not key or request.method != "POST":
            return view_func(request, *args, **kwargs)

        body_hash = hashlib.sha256(request.body).hexdigest()
        composite_key = f"{request.user.id}:{key}"

        try:
            # The INSERT claims the key; the primary key rejects any duplicate.
            with transaction.atomic():
                IdempotencyKey.objects.create(
                    key=composite_key,
                    user_id=request.user.id,
                    request_hash=body_hash,
                )
        except IntegrityError:
            # The key already exists: mismatch, in flight, or replay.
            existing = IdempotencyKey.objects.get(key=composite_key)
            if existing.request_hash != body_hash:
                return Response(
                    {"error": "Idempotency-Key reused with different body"},
                    status=422,
                )
            if existing.completed_at is None:
                return Response(
                    {"error": "Request already in flight"},
                    status=409,
                )
            return Response(existing.response_body, status=existing.response_status)

        # First time this key is seen: run the view, then store its response.
        response = view_func(request, *args, **kwargs)

        IdempotencyKey.objects.filter(key=composite_key).update(
            response_status=response.status_code,
            response_body=response.data,
            completed_at=timezone.now(),
        )
        return response
    return wrapper

The pattern rests entirely on the primary key uniqueness constraint, exactly like the Inbox pattern: two simultaneous requests with the same key trigger an IntegrityError on the second one, ruling out any race between SELECT and INSERT. The database is the arbiter, not the application code.
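The "database as arbiter" idea can be reproduced outside Django with the standard library's sqlite3 module. This is a simplified sketch: the table layout mirrors the schema above only loosely, and the claim helper is a hypothetical name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE idempotency_key ("
    " key TEXT PRIMARY KEY,"
    " request_hash TEXT NOT NULL,"
    " completed_at TEXT)"
)


def claim(conn, key, request_hash):
    """Return True if this caller inserted the row, False if it already existed."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO idempotency_key (key, request_hash) VALUES (?, ?)",
                (key, request_hash),
            )
        return True
    except sqlite3.IntegrityError:
        # The primary key rejected the duplicate: someone else owns this key.
        return False


first = claim(conn, "42:7f3a1b9e", "abc")
second = claim(conn, "42:7f3a1b9e", "abc")
print(first, second)  # True False
```

There is no SELECT before the INSERT: the caller learns whether it owns the key from the outcome of the write itself, which is what removes the race window.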

The concurrent retry case

The 409 Conflict when completed_at IS NULL deserves an explanation. Picture this: a client fires a request with key K that takes 5 seconds to process (Stripe call, PDF generation). During those 5 seconds, the network drops, the client retries with the same key K.

Without the completed_at IS NULL branch, the server would find the existing idempotency row, have no response to replay (since the first request did not finish), and probably return inconsistent state. The 409 tells the client: “your first request is still in flight, wait and retry in a few seconds”.

That is the HTTP equivalent of the select_for_update(skip_locked=True) we saw in the Outbox relay: we prevent two concurrent requests from doing the same work in parallel.
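From the client's point of view, a 409 means "wait, then ask again with the same key". A minimal sketch of that loop, assuming a hypothetical send callable that returns a (status_code, body) pair and always reuses the same Idempotency-Key:

```python
import time


def send_until_settled(send, max_attempts=5, delay_seconds=1.0, sleep=time.sleep):
    """Re-send while the server answers 409 (first attempt still in flight).

    `send` is a hypothetical callable returning (status_code, body). The
    `sleep` parameter exists only so the wait can be replaced in tests.
    """
    for attempt in range(max_attempts):
        status, body = send()
        if status != 409:
            return status, body
        if attempt < max_attempts - 1:
            sleep(delay_seconds)
    # Still in flight after the last attempt: hand the 409 back to the caller.
    return status, body
```

The fixed delay keeps the sketch short; an exponential backoff would be a reasonable substitute when processing times vary widely.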

The body-mismatch pitfall

The request_hash is not a formality. Without it, a client bug that reused the same key for a different request would always see the first response and never know that the second request was never processed.

Typical scenario: a script looping over 500 payments with a fixed key by mistake. Without the hash, the next 499 silently return the first response. No payment created. The error stays invisible until the monthly billing.

With the hash, the second call gets an immediate 422. The bug surfaces, the script crashes, the operator can step in before the damage spreads.
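For batch scripts like the one in this scenario, one way to avoid the fixed-key mistake is to derive each key from a stable business identifier instead of a variable set once outside the loop. The sketch below uses uuid.uuid5 for that; the namespace, the invoice identifier and the function name are illustrative assumptions.

```python
import uuid

# Hypothetical namespace for this application; any fixed UUID would do.
PAYMENT_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "payments.example.com")


def idempotency_key_for(invoice_id: str) -> str:
    """Deterministic key: the same invoice always maps to the same key."""
    return str(uuid.uuid5(PAYMENT_NAMESPACE, f"invoice:{invoice_id}"))


# One distinct key per payment, stable across reruns of the script.
keys = [idempotency_key_for(str(i)) for i in range(500)]
```

A side benefit is that rerunning the script after a crash reuses the same keys, so payments already processed within the retention window are replayed rather than duplicated.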

TTL and retention

An idempotency key is not meant to live forever. The typical lifetime is 24 hours, sometimes 7 days for heavy financial operations. Past that, we assume the client has given up retrying and free the key.

Purging happens via a periodic Celery job:

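from datetime import timedelta

from celery import shared_task
from django.utils import timezone


@shared_task
def purge_idempotency_keys():
    # Delete every key older than the 24-hour retention window.
    cutoff = timezone.now() - timedelta(hours=24)
    IdempotencyKey.objects.filter(created_at__lt=cutoff).delete()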

The index on created_at keeps that query fast. On PostgreSQL, we can go further with day-based partitioning if volume justifies the complexity.

Idempotency Keys vs Inbox: two layers, same idea

Both patterns rest on the same primitive (a uniqueness constraint plus an atomic transaction), but apply to different layers.

The Idempotency Key protects the HTTP boundary: between client and API. The key is generated by the client, transported in a header, and stored by the server alongside the response.

The Inbox protects the broker boundary: between Kafka and the consumer. The identifier is generated by the producer, transported in the event payload, and stored by the consumer with no response to replay (events have no response).

A modern distributed system often combines both. The public API exposes an endpoint with Idempotency-Key that creates an order and publishes an event through the Outbox. Downstream consumers use the Inbox to avoid processing the same event twice. Every boundary has its own protection, and the whole system becomes resilient to duplicates at every layer.

When not to use an Idempotency Key

HTTP already defines some methods as idempotent. GET, PUT, DELETE must produce the same state when repeated: a DELETE /resource/1 repeated a thousand times must end with the resource absent. Adding an idempotency key on those endpoints is redundant.
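As a rough illustration, the set of methods that RFC 9110 defines as idempotent can be written down and used to decide where a key adds value. The helper name is hypothetical, and the check is only a first filter: it says nothing about whether a given POST has a business effect worth protecting.

```python
# Methods defined as idempotent by RFC 9110 (HTTP Semantics).
IDEMPOTENT_METHODS = frozenset({"GET", "HEAD", "OPTIONS", "TRACE", "PUT", "DELETE"})


def benefits_from_idempotency_key(method: str) -> bool:
    """True for methods whose repetition is not idempotent by definition."""
    return method.upper() not in IDEMPOTENT_METHODS
```

Note that PATCH is not in the set, so it falls on the same side as POST.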

The pattern is actually useful on:

  • POST endpoints that create a resource with a business effect (payment, order, transfer, send)
  • POST endpoints that trigger an expensive external action (email, PDF generation, partner call)
  • public endpoints exposed to clients whose retry logic we do not control

Conversely, for an internal endpoint between microservices where retry control is shared, or for a read-only API, the implementation and storage cost is not worth it.

Conclusion

The Idempotency Key is the least glamorous of the three patterns, but probably the most visible when it is missing. A customer paying twice does not forgive a botched retry. A customer seeing 409 Conflict or 422 Idempotency-Key reused understands something happened and can react.

It is also the pattern that closes the series. With Saga, Outbox, Inbox and Idempotency Keys, we have covered the four boundaries where a distributed system can lose consistency: between workflow steps, between a database and a broker, between a broker and a consumer, and between an HTTP client and the server. Each one protects a specific place. None of them is enough on its own.