Add a Node Without Reshuffling the World
Consistent Hashing
The hash ring, virtual nodes, and why consistent hashing moves only a small fraction of keys when nodes join or leave — with a working JS implementation.
What you'll learn
- Explain the hash ring and how keys map to nodes
- Describe why consistent hashing minimizes reshuffling
- Implement a consistent-hash ring with virtual nodes in JS
Last lesson ended on a time bomb: shard = hash(key) % N. The moment N
changes — you add a node, or one dies — almost every key remaps to a different
shard, triggering a massive, cache-busting reshuffle. Consistent hashing is
the technique that fixes exactly this, and it’s the backbone of Dynamo,
Cassandra, and most distributed caches.
The promise: when you add or remove a node, only about 1/N of the keys move —
not all of them.
The ring
Picture the hash space as a circle — every hash value from 0 to its maximum, wrapped end to end. Now:
- Place each node on the ring by hashing its identifier.
- Place each key on the ring by hashing the key.
- To find a key’s node, walk clockwise from the key until you hit the first node. That node owns the key.
Now watch what happens when a node leaves: only the keys that used to walk to that node move — they continue clockwise to the next node. Every other key is untouched. Adding a node is symmetric: the new node steals only the slice of keys that now land between it and its clockwise neighbor. That’s the whole magic — a node change is a local disruption, not a global one.
| Approach | Keys moved when N changes |
|---|---|
hash(key) % N | nearly all of them |
| Consistent hashing | ~1/N (just the affected arc) |
Virtual nodes
Plain consistent hashing has a flaw: with only a few nodes, they land on the ring unevenly, so one node randomly owns a huge arc and gets overloaded. And when a node dies, all its load dumps onto its single clockwise neighbor.
The fix is virtual nodes: each physical node is placed on the ring many times (say 100–200 positions) under hashed variants of its name. Now each physical node owns many small scattered arcs instead of one big one, so:
- Load is even — the law of large numbers smooths the distribution.
- Failure spreads — a dead node’s arcs scatter across all survivors, not one unlucky neighbor.
- Heterogeneous hardware works — give a bigger machine more virtual nodes and it gets a proportionally larger share.
Implementing it
Here’s a small but complete ring with virtual nodes. The trick is a sorted array of ring positions plus a binary search to find the first position at or after a key’s hash — wrapping to index 0 if we run off the end.
import { createHash } from 'node:crypto';
function hash(str) {
// 32 bits of an md5 digest → a point on the ring [0, 2^32).
return createHash('md5').update(str).digest().readUInt32BE(0);
}
class HashRing {
constructor(vnodes = 150) {
this.vnodes = vnodes;
this.ring = new Map(); // position -> physical node
this.sorted = []; // sorted ring positions
}
addNode(node) {
for (let i = 0; i < this.vnodes; i++) {
this.ring.set(hash(`${node}#${i}`), node);
}
this.sorted = [...this.ring.keys()].sort((a, b) => a - b);
}
removeNode(node) {
for (let i = 0; i < this.vnodes; i++) this.ring.delete(hash(`${node}#${i}`));
this.sorted = [...this.ring.keys()].sort((a, b) => a - b);
}
getNode(key) {
if (this.sorted.length === 0) return null;
const h = hash(key);
// Binary search for the first ring position >= h; wrap to 0 if none.
let lo = 0, hi = this.sorted.length;
while (lo < hi) {
const mid = (lo + hi) >> 1;
if (this.sorted[mid] < h) lo = mid + 1;
else hi = mid;
}
const pos = this.sorted[lo % this.sorted.length]; // wrap around
return this.ring.get(pos);
}
}
const ring = new HashRing();
['cache-a', 'cache-b', 'cache-c'].forEach((n) => ring.addNode(n));
console.log(ring.getNode('user:42')); // e.g. 'cache-b'
// Add a node: only the small arc of keys near it move; the rest stay put.
ring.addNode('cache-d');
console.log(ring.getNode('user:42')); // most keys, including this one, unchanged The JavaScript angle
This isn’t theoretical for Node engineers — it’s running in your stack already. A
Redis cluster uses a related slot-mapping scheme, and a client-side sharded
cache pool is a hash ring. The win is concrete: when you scale a cache from 3
to 4 nodes with a hash ring, roughly 25% of keys miss and refill — annoying but
survivable. With % N, nearly every key misses at once, hammering your
database with a thundering herd the moment you scale up. Consistent hashing is
what makes “add a cache node” a non-event instead of an outage.
We’ve now covered how to spread data across nodes. The instant data lives on multiple machines, a network partition becomes inevitable — and that forces the single most famous tradeoff in distributed systems: CAP.