Idempotency and the Exactly-Once Myth

Make Retries Safe, Because Duplicates Are Inevitable

Why "exactly-once" delivery is a myth, how idempotency keys and dedup windows make retries safe, and an Express + Redis idempotency-key middleware sketch.

9 min read · Level 4/5 · #system-design #idempotency #exactly-once
What you'll learn
  • Explain why exactly-once delivery is at-least-once plus deduplication
  • Design idempotency keys and deduplication windows
  • Sketch an idempotency-key middleware in Express with Redis

Every queue and worker in this section came with the same warning: at-least-once delivery means duplicates happen. A worker crashes after doing the work but before acking; a client retries a request that actually succeeded; a network blip hides a 200. The professional response isn’t to chase impossible guarantees — it’s to make duplicates harmless. That property is idempotency, and it’s the safety net under the entire async architecture.

Idempotency, defined

An operation is idempotent if doing it twice has the same effect as doing it once. GET is naturally idempotent (reading twice changes nothing). Setting status = 'shipped' is idempotent. But balance = balance + 100 is not — run it twice and you’ve created $100 out of nothing. Most “create” and “charge” operations are non-idempotent by default, which is exactly where retries bite.

Operation               | Idempotent? | Why
GET /orders/42          | Yes         | Reading changes nothing
PUT status = 'shipped'  | Yes         | Same end state every time
DELETE /orders/42       | Yes         | Already-deleted stays deleted
balance += 100          | No          | Each call adds again
POST /charge $50        | No          | Each call charges again
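To make the distinction concrete, here is a minimal sketch that runs one idempotent and one non-idempotent operation twice, as a retry would. The `order` and `account` objects and both function names are hypothetical, chosen only for illustration.

```javascript
// Hypothetical in-memory records, used only for illustration.
const order = { id: 42, status: 'pending' };
const account = { balance: 0 };

// Idempotent: assigns an absolute value, so repeats converge on one state.
function markShipped(o) {
  o.status = 'shipped';
}

// Not idempotent: applies a relative change, so every repeat stacks.
function deposit(a, amount) {
  a.balance += amount;
}

// Simulate a retry by running each operation twice.
markShipped(order);
markShipped(order);
deposit(account, 100);
deposit(account, 100);

console.log(order.status);    // 'shipped' (same as after one call)
console.log(account.balance); // 200 (one call would have left 100)
```

The pattern to notice: operations that write an absolute value tend to be idempotent, while operations that apply a delta are not.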

Why “exactly-once” is a myth

Engineers dream of exactly-once delivery: the message arrives once, no loss, no duplicate. In a distributed system with networks that can drop, delay, and reorder, it’s impossible as a pure delivery guarantee. The reason is the two-generals problem: after a worker does the work, it must tell the broker “done.” If that ack is lost, the broker can’t tell whether the work happened or the worker died — so it must redeliver. Redelivery means a possible duplicate. No amount of cleverness removes that fork.

[Figure: architecture diagram]

So what people call “exactly-once” is really at-least-once delivery + idempotent processing (deduplication). You can’t stop the duplicate from being delivered; you stop it from having a duplicate effect. That reframing is the single most important idea in this lesson.
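The reframing can be simulated in a few lines. This is a toy model, not a real broker: `deliver` is a hypothetical function that redelivers until an ack arrives, and `ackLostOnAttempt` is an assumed knob for dropping one ack. The same duplicate delivery happens in both runs; only the idempotent handler keeps it from having a duplicate effect.

```javascript
// Toy broker: redelivers a message until it sees an ack.
// `ackLostOnAttempt` simulates the ack vanishing on that attempt number.
function deliver(message, handler, ackLostOnAttempt) {
  let attempt = 0;
  while (true) {
    attempt += 1;
    handler(message);                       // the worker does the work
    const ackArrived = attempt !== ackLostOnAttempt;
    if (ackArrived) return attempt;         // broker stops redelivering
    // Ack lost: the broker cannot tell "done" from "crashed", so it loops.
  }
}

// Naive handler: every delivery produces the side effect.
let naiveCharges = 0;
const naiveHandler = () => { naiveCharges += 1; };

// Idempotent handler: remembers message ids it has already processed.
let dedupedCharges = 0;
const seen = new Set();
const idempotentHandler = (msg) => {
  if (seen.has(msg.id)) return;             // duplicate delivery, no effect
  seen.add(msg.id);
  dedupedCharges += 1;
};

const message = { id: 'msg-1', amount: 50 };
const naiveDeliveries = deliver(message, naiveHandler, 1);
const dedupedDeliveries = deliver(message, idempotentHandler, 1);

console.log(naiveDeliveries, naiveCharges);     // 2 2
console.log(dedupedDeliveries, dedupedCharges); // 2 1
```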

Idempotency keys

The standard mechanism is the idempotency key: the client generates a unique id (a UUID) for an operation and sends it with the request, typically as an Idempotency-Key header. The server records that key the first time it processes the operation. If the same key arrives again, the server recognizes it, skips re-doing the work, and returns the stored original response. This is exactly how Stripe makes payment retries safe.

The flow has three states per key: never seen (process it, store the result), in progress (a retry arrived mid-flight — reject or wait), and completed (return the cached response without re-running).
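The three states can be sketched with a plain `Map` before bringing in Redis. The `handle` function and the outcome labels below are illustrative names, not part of any library, and the mid-flight retry is simulated by re-entering `handle` from inside the work.

```javascript
// In-memory stand-in for the key store; a real service would use a shared
// store such as Redis so every server instance sees the same keys.
const store = new Map();

// Runs `work` at most once per key and reports which state the key was in.
function handle(key, work) {
  const entry = store.get(key);

  if (entry === undefined) {
    // Never seen: claim the key, do the work, then store the result.
    store.set(key, { state: 'in-progress' });
    const response = work();
    store.set(key, { state: 'completed', response });
    return { outcome: 'processed', response };
  }

  if (entry.state === 'in-progress') {
    // A retry arrived mid-flight: reject it (the caller can retry later).
    return { outcome: 'rejected' };
  }

  // Completed: return the stored response without re-running the work.
  return { outcome: 'replayed', response: entry.response };
}

let charges = 0;
let midFlight;
const first = handle('key-1', () => {
  charges += 1;
  midFlight = handle('key-1', () => { charges += 1; }); // retry during the work
  return { charged: 50 };
});
const retry = handle('key-1', () => { charges += 1; return { charged: 50 }; });

console.log(first.outcome, midFlight.outcome, retry.outcome, charges);
// processed rejected replayed 1
```

This sketch leaves a key stuck in `in-progress` if `work` throws; a production version needs a way to release or expire the claim.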

[Figure: architecture diagram]

Deduplication windows

You can’t store every key forever — that’s an unbounded table. So dedup is scoped to a window: keys live for, say, 24 hours (Stripe’s window), after which they expire. The window must comfortably exceed the longest realistic retry horizon (client retries, queue redelivery, manual replays). Redis is ideal here because a key with a TTL is a deduplication window — set it once, let it expire.
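As a rough illustration of what a TTL gives you, here is an in-memory dedup window. `createDedupWindow`, `claim`, and `sweep` are hypothetical names, and the clock is injected so the example can jump forward in time without sleeping. With Redis you would not write this yourself; key expiry does the same job.

```javascript
// Dedup window backed by a Map of key -> expiry time (ms).
// `now` is injectable so the example can advance time without sleeping.
function createDedupWindow(windowMs, now = Date.now) {
  const expiries = new Map();
  return {
    // Returns true if the key is new (caller should process), false if it
    // is a duplicate that arrived inside the window.
    claim(key) {
      const t = now();
      const expiry = expiries.get(key);
      if (expiry !== undefined && expiry > t) return false;
      expiries.set(key, t + windowMs);
      return true;
    },
    // Drops expired keys so the table stays bounded; returns keys remaining.
    sweep() {
      const t = now();
      for (const [key, expiry] of expiries) {
        if (expiry <= t) expiries.delete(key);
      }
      return expiries.size;
    },
  };
}

let clock = 0;
const HOUR = 60 * 60 * 1000;
const dedup = createDedupWindow(24 * HOUR, () => clock);

const firstSeen = dedup.claim('payment:1');   // true: new key
clock += 1 * HOUR;
const retried = dedup.claim('payment:1');     // false: inside the window
clock += 24 * HOUR;
const afterExpiry = dedup.claim('payment:1'); // true: the window has passed
```

The last call shows the risk the window size has to cover: a retry that arrives after expiry is treated as a brand-new operation.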

The JavaScript angle

Here’s an Express + Redis idempotency-key middleware. It uses Redis SET NX (set only if absent) as an atomic “claim this key” so two concurrent retries can’t both slip through. A successful (2xx) JSON response is then cached for the window, while a failed one releases the claim so a legitimate retry can run.

An idempotency-key middleware in Express + Redis (script.js)
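import express from 'express';
import { createClient } from 'redis';

const app = express();
app.use(express.json());

const redis = createClient();
await redis.connect(); // top-level await: run this file as an ES module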

const WINDOW = 60 * 60 * 24; // 24h: how long a completed result stays replayable
const CLAIM_TTL = 30;        // seconds: a crashed/failed request frees the key fast

function idempotency() {
  return async (req, res, next) => {
    const key = req.header('Idempotency-Key');
    if (!key) return next(); // opt-in; only guard requests that send a key
    const ns = `idem:${key}`;

    // Atomically claim the key. NX => only the FIRST request wins. The SHORT
    // claim TTL means a crash mid-request frees the key in seconds, not a day.
    // Trade-off: a handler that runs longer than CLAIM_TTL loses its claim.
    try {
      const claimed = await redis.set(ns, 'in-progress', { NX: true, EX: CLAIM_TTL });

      if (!claimed) {
        const stored = await redis.get(ns);
        if (stored && stored !== 'in-progress') {
          const { status, body } = JSON.parse(stored);
          return res.status(status).json(body); // replay the ORIGINAL response
        }
        return res.status(409).json({ error: 'request already in progress' });
      }
    } catch (err) {
      // Express 4 does not catch rejections from async middleware, so pass
      // Redis errors to the error handler explicitly.
      return next(err);
    }

    // First time through: capture the response so retries can replay it.
    const originalJson = res.json.bind(res);
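    res.json = (body) => {
      if (res.statusCode >= 200 && res.statusCode < 300) {
        // Success: cache status + body for the full replay window.
        // Fire-and-forget, but log failures so a Redis error cannot become
        // an unhandled promise rejection.
        redis.set(ns, JSON.stringify({ status: res.statusCode, body }), { EX: WINDOW })
          .catch((err) => console.error('idempotency cache write failed', err));
      } else {
        // Failure: release the claim NOW so a legitimate retry can run.
        redis.del(ns)
          .catch((err) => console.error('idempotency claim release failed', err));
      }
      return originalJson(body);
    };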
    next();
  };
}

// One line on the routes that must not double-execute. chargeHandler is the
// route's own handler (not shown); it must reply via res.json to be cached.
app.post('/charge', idempotency(), chargeHandler);

Now a client that retries POST /charge after a dropped response sends the same Idempotency-Key, hits the cached result, and is charged exactly once — without any “exactly-once delivery” magic underneath. The same idea protects a BullMQ worker: dedup on a job’s business key (e.g. payment:<id>) before the side effect.
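The client side of that exchange might look like the sketch below. `postWithRetry` and `fakeSend` are hypothetical: the transport is injected so the example runs without a network, and the fake server drops its first response after doing the work. The key is passed in by the caller (in practice a UUID generated once per operation); the point is that every attempt reuses it.

```javascript
// Retries a request, reusing ONE idempotency key across all attempts.
// `send(headers, body)` is a hypothetical transport that returns a promise
// and rejects when the response is lost.
async function postWithRetry(send, idempotencyKey, body, maxAttempts = 3) {
  const headers = { 'Idempotency-Key': idempotencyKey };
  let lastError;
  for (let attempt = 1; attempt <= maxAttempts; attempt += 1) {
    try {
      return await send(headers, body);
    } catch (err) {
      lastError = err; // response lost or request failed: try again, same key
    }
  }
  throw lastError;
}

// Fake server for the demo: dedups on the key and "drops" the first response.
const responses = new Map();
let chargeCount = 0;
let dropNext = true;
async function fakeSend(headers, body) {
  const key = headers['Idempotency-Key'];
  if (!responses.has(key)) {
    chargeCount += 1; // the side effect happens once per key
    responses.set(key, { status: 200, charged: body.amount });
  }
  if (dropNext) {
    dropNext = false;
    throw new Error('connection reset'); // work succeeded, reply was lost
  }
  return responses.get(key);
}

const result = postWithRetry(fakeSend, 'key-123', { amount: 50 });
result.then((res) => console.log(res.status, chargeCount)); // 200 1
```

A real client would also wait between attempts (for example with exponential backoff) and generate a fresh key for each new operation, never for each retry.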

That completes Communication & Messaging: you can move bytes (protocols), shape APIs, push real-time updates, scale sockets, queue and broadcast work, and make every retry safe. Next we shift to keeping systems healthy under load, starting with rate limiting.