Cache Eviction and Stampedes

What Gets Thrown Out, and What Happens on a Miss

Cache Eviction and Stampedes

Eviction policies — LRU, LFU, FIFO, TTL — and the cache stampede / thundering herd problem, with mitigations: locks, request coalescing, stale-while- revalidate, and jittered TTLs.

9 min read Level 3/5 #system-design#caching#eviction
What you'll learn
  • Compare LRU, LFU, FIFO, and TTL eviction policies
  • Explain cache stampede, thundering herd, and dogpile effects
  • Apply coalescing, locks, stale-while-revalidate, and TTL jitter

A cache is finite memory. Once it fills, every new entry forces an old one out — and which one you throw out determines your hit ratio. That decision is the eviction policy. And there’s a second, sneakier problem hiding in eviction: when a popular key gets evicted (or expires), every request that wanted it misses at once and stampedes your database. This lesson covers both — what to evict, and how to survive the miss.

Eviction policies: who gets thrown out

When the cache is full and you need room, you choose a victim. The common policies:

PolicyEvictsGood whenWeakness
LRU (Least Recently Used)The entry untouched longestRecency predicts reuse (most workloads)A one-off scan can flush hot data
LFU (Least Frequently Used)The entry accessed fewest timesPopularity is stable over timeOld-but-once-popular entries linger
FIFO (First In, First Out)The oldest-inserted entryRarely the best choiceIgnores access entirely
TTL (Time To Live)Anything past its expiryData goes stale on a known scheduleNot a sizing policy on its own

LRU is the default for a reason — in most systems, recently-used things are the things you’ll use again, so evicting the least-recently-used entry keeps the hot set resident. LFU wins when popularity is durable (a handful of items are hot for days). TTL isn’t really a sizing policy — it bounds staleness, not memory — but it pairs with the others: entries expire on schedule and get evicted under pressure. Redis, as we’ll see next lesson, offers configurable combinations like allkeys-lru and volatile-lfu.

A tiny LRU, in JavaScript

LRU is worth understanding from the inside, because the elegant implementation leans on a JavaScript quirk: a Map remembers insertion order, and re-setting a key moves it to the end. That gives you LRU in a few lines — no linked list required.

An LRU cache built on Map insertion order script.js
class LRUCache {
  constructor(capacity) {
    this.capacity = capacity;
    this.map = new Map();           // iteration order = least→most recent
  }

  get(key) {
    if (!this.map.has(key)) return undefined;
    const value = this.map.get(key);
    this.map.delete(key);           // remove…
    this.map.set(key, value);       // …re-insert at the end = "most recent"
    return value;
  }

  set(key, value) {
    if (this.map.has(key)) this.map.delete(key);   // refresh position
    this.map.set(key, value);

    if (this.map.size > this.capacity) {
      // The first key in iteration order is the least-recently-used.
      const lru = this.map.keys().next().value;
      this.map.delete(lru);         // evict it
    }
  }
}

const cache = new LRUCache(2);
cache.set('a', 1);
cache.set('b', 2);
cache.get('a');                     // touch 'a' → 'b' is now LRU
cache.set('c', 3);                  // evicts 'b'
console.log(cache.map.has('b'));    // false
▶ Preview: console

Both get and set run in O(1), and the Map’s ordering does all the bookkeeping a textbook LRU would do with a hash map plus a doubly-linked list.

The cache stampede (thundering herd / dogpile)

Now the dangerous part. Picture a hot key — say a homepage feed cached for many thousands of concurrent readers. Its TTL expires. In the very next instant, all of those requests check the cache, all miss, and all run the expensive query to repopulate it — simultaneously. Your database, which was happily serving zero load for this key, is suddenly hit by thousands of identical queries at once.

This is the cache stampede, also called the thundering herd or dogpile effect. A single expiry triggers a flood. It’s most lethal precisely on your most popular keys, and it tends to strike under your highest traffic — exactly when you can least afford it.

Cache Eviction and Stampedes — architecture diagram

Mitigations

Four techniques, often combined:

1. Request coalescing (single-flight). Within one process, let only the first miss do the work; everyone else awaits its result. A promise cache makes this trivial in Node — store the in-flight promise, not just the value, so concurrent callers share one fetch.

Coalesce concurrent misses into one fetch script.js
const inFlight = new Map();      // key → Promise of the value

async function getCoalesced(key, loader) {
  const hit = await redis.get(key);
  if (hit !== null) return JSON.parse(hit);

  // If a fetch for this key is already running, await IT instead of
  // launching a second identical query.
  if (inFlight.has(key)) return inFlight.get(key);

  const promise = (async () => {
    const value = await loader();                 // the one real DB call
    await redis.set(key, JSON.stringify(value), 'EX', 300);
    return value;
  })().finally(() => inFlight.delete(key));

  inFlight.set(key, promise);
  return promise;
}
▶ Preview: console

2. Distributed locks. Coalescing only dedupes within one process; across a fleet, use a short-lived Redis lock so only one instance repopulates a given key while the others wait or serve stale. (We cover Redlock and its caveats later in the track — locks are powerful but have sharp edges.)

3. Stale-while-revalidate. Serve the stale value immediately when it expires, and refresh it in the background. No request ever waits on the recompute, and the herd never forms — readers get a slightly-old answer for a moment while one background task refreshes. This is the most user-friendly mitigation when slightly-stale data is acceptable.

4. Jittered TTLs. Stampedes are worsened when many keys share the same expiry (e.g. you warmed 10,000 entries at deploy time with a flat 300s TTL — they all expire together). Add randomness: ttl = base + random(0, spread). The staggered expiries spread the repopulation load over time instead of bunching it into one spike.

Jittered TTL to de-synchronize expiries script.js
function jitteredTtl(baseSeconds, spreadSeconds = 60) {
  // e.g. base 300 + 0..60 → expiries spread across a minute, not all at once.
  return baseSeconds + Math.floor(Math.random() * spreadSeconds);
}

await redis.set(key, JSON.stringify(value), 'EX', jitteredTtl(300));
▶ Preview: console

You now understand caching end-to-end: strategies, placement, eviction, and the stampede. The cache we keep reaching for is Redis — so next we go deep on Redis itself, through the eyes of a Node engineer using ioredis.