Replication

Copy the Data So a Dead Node Isn't a Dead System

Leader-follower, multi-leader, and leaderless replication; sync vs async; read replicas, replication lag, and the read-your-own-writes pitfall.

9 min read · Level 3/5 · #system-design #databases #replication
What you'll learn
  • Compare leader-follower, multi-leader, and leaderless replication
  • Reason about synchronous vs asynchronous tradeoffs
  • Avoid replication-lag bugs when routing reads to replicas

Replication is keeping a copy of your data on multiple machines. You do it for two reasons: availability (a node dies, another takes over) and read throughput (spread reads across copies). It sounds simple — “just copy the writes” — but when and how those copies stay in sync is where every distributed-data headache begins.

Leader-follower (the default)

The most common arrangement: one leader (primary) accepts all writes; followers (replicas) apply the leader’s change stream and serve reads. This is the standard setup for Postgres, MySQL, and MongoDB replica sets, and for most relational databases.

Replication — architecture diagram

It works because reads usually vastly outnumber writes (recall the ~10:1 ratio from the estimation lesson) — so funneling writes to one node while fanning reads across many is a great fit. The weakness: the leader is a single write bottleneck and a single point of failure for writes until failover promotes a follower.
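
A minimal in-memory sketch of this arrangement, assuming the change stream is a simple append-only log. The `Leader` and `Follower` classes and their methods are hypothetical names for illustration, not any database's API; real systems ship write-ahead-log or binlog records over the network.

```javascript
// Hypothetical in-memory model: the leader appends every write to a change log,
// and each follower replays the log entries it has not applied yet.
class Leader {
  constructor() {
    this.log = [];          // ordered change stream
    this.data = new Map();  // current state on the leader
  }
  write(key, value) {
    this.log.push({ key, value });
    this.data.set(key, value);
  }
}

class Follower {
  constructor() {
    this.applied = 0;       // number of log entries already replayed
    this.data = new Map();
  }
  // Pull and apply any entries this follower has not seen yet.
  catchUp(leader) {
    for (; this.applied < leader.log.length; this.applied++) {
      const { key, value } = leader.log[this.applied];
      this.data.set(key, value);
    }
  }
  read(key) {
    return this.data.get(key);
  }
}

const leader = new Leader();
const followers = [new Follower(), new Follower()];

leader.write('user:1', 'Ada');               // all writes go through the one leader
followers.forEach((f) => f.catchUp(leader)); // followers replay the change stream

// Reads fan out across the followers.
const names = followers.map((f) => f.read('user:1'));
console.log(names);
```

Each follower tracks how far into the log it has replayed, which is also what makes lag measurable: it is the gap between `leader.log.length` and a follower's `applied` counter.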

Synchronous vs asynchronous

The pivotal choice is when the leader considers a write done.

  • Synchronous: the leader waits for the follower to confirm before acknowledging the client. A synchronous follower is guaranteed current, but each write is as slow as the slowest follower the leader waits on, and if that follower stalls, writes stall.
  • Asynchronous: the leader acknowledges immediately and ships changes to followers in the background. Fast and available, but followers lag, and a leader crash can lose the not-yet-replicated tail of writes.

A common compromise is semi-synchronous replication: one synchronous follower (so a confirmed write survives a leader crash) and the rest asynchronous (for speed).
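
The durability difference can be sketched with a toy model that counts which acknowledged writes survive a leader crash. Everything here (`makeCluster`, `flush`, `crashAndPromote`) is a hypothetical name for illustration, and the "network" is just an array copy.

```javascript
// Hypothetical model: a write is "acknowledged" once the leader replies OK.
// mode 'sync'  -> copy to the follower before acknowledging.
// mode 'async' -> acknowledge first; the copy happens later via flush().
function makeCluster(mode) {
  const leaderLog = [];
  const followerLog = [];
  const acknowledged = [];
  return {
    write(value) {
      leaderLog.push(value);
      if (mode === 'sync') followerLog.push(value); // wait for the follower
      acknowledged.push(value);                     // now reply to the client
    },
    // Background shipping for async mode: send everything the follower lacks.
    flush() {
      while (followerLog.length < leaderLog.length) {
        followerLog.push(leaderLog[followerLog.length]);
      }
    },
    // The leader crashes and the follower is promoted with whatever it has.
    // Returns the acknowledged writes that the new leader does not have.
    crashAndPromote() {
      return acknowledged.filter((v) => !followerLog.includes(v));
    },
  };
}

const syncCluster = makeCluster('sync');
syncCluster.write('a');
syncCluster.write('b');
const lostSync = syncCluster.crashAndPromote();

const asyncCluster = makeCluster('async');
asyncCluster.write('a');
asyncCluster.flush();      // 'a' reaches the follower
asyncCluster.write('b');   // acknowledged, but not shipped before the crash
const lostAsync = asyncCluster.crashAndPromote();

console.log({ lostSync, lostAsync });
```

In this model `lostSync` is empty and `lostAsync` contains `'b'`: the client was told `'b'` succeeded, yet the promoted follower never received it.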

| | Synchronous | Asynchronous |
|---|---|---|
| Write latency | slow (waits for replica) | fast (fire and forget) |
| Durability on crash | strong | may lose recent writes |
| Replica freshness | up to date | lags behind |
| Availability under replica failure | writes block | writes continue |

Multi-leader and leaderless

When one leader isn’t enough, two other topologies appear:

Multi-leader — several nodes accept writes (often one per region, so each region writes locally with low latency). The hard part is conflict resolution: two leaders can accept conflicting writes to the same row, and you need a rule (last-write-wins, version vectors, or app-level merge) to reconcile.
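
Two of the rules named above can be sketched in a few lines. This is an illustration under assumptions, not a production merge: version vectors are plain objects mapping a node ID to a counter, and the record shape (`value`, `vector`, `timestamp`) is hypothetical.

```javascript
// Compare two version vectors (plain objects: nodeId -> counter).
// Returns 'after' if a has seen everything b has and more, 'before' for the
// reverse, 'equal' if identical, and 'concurrent' if neither dominates.
function compareVectors(a, b) {
  let aAhead = false;
  let bAhead = false;
  const nodes = new Set([...Object.keys(a), ...Object.keys(b)]);
  for (const node of nodes) {
    const av = a[node] ?? 0;
    const bv = b[node] ?? 0;
    if (av > bv) aAhead = true;
    if (bv > av) bAhead = true;
  }
  if (aAhead && bAhead) return 'concurrent';
  if (aAhead) return 'after';
  if (bAhead) return 'before';
  return 'equal';
}

// Last-write-wins: keep the version with the larger timestamp.
// Simple, but it silently discards the losing write.
function lastWriteWins(x, y) {
  return x.timestamp >= y.timestamp ? x : y;
}

function resolve(x, y) {
  const order = compareVectors(x.vector, y.vector);
  if (order === 'after' || order === 'equal') return x;
  if (order === 'before') return y;
  return lastWriteWins(x, y);  // genuine conflict: fall back to LWW
}

// Two regions each accepted a write without having seen the other's.
const fromEurope = { value: 'Ada L.', vector: { eu: 2, us: 1 }, timestamp: 1700000000200 };
const fromUS = { value: 'A. Lovelace', vector: { eu: 1, us: 2 }, timestamp: 1700000000100 };

const relation = compareVectors(fromEurope.vector, fromUS.vector);
const winner = resolve(fromEurope, fromUS);
console.log(relation, winner.value);
```

The vectors detect that the two writes are concurrent, which timestamps alone cannot do. Falling back to last-write-wins then picks `'Ada L.'` and drops the other write, which is exactly the data loss an app-level merge would try to avoid.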

Leaderless — every replica accepts reads and writes; the client (or a coordinator) writes to several nodes and reads from several, using quorums to stay consistent (Dynamo, Cassandra). With N replicas, if writes hit W nodes and reads hit R nodes and W + R > N, any read overlaps at least one node with the latest write. We revisit quorums in the fault-tolerance lessons.
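
A small sketch of the quorum condition, assuming each replica stores a value with a version number and the reader keeps the highest version it sees. The function names and the replica layout are hypothetical.

```javascript
// With n replicas, w write acks, and r read responses, every read set
// intersects every write set exactly when w + r > n.
function quorumOverlaps(n, w, r) {
  return w + r > n;
}

// Hypothetical replicas: each stores { value, version } for a single key.
function quorumWrite(replicas, targets, value, version) {
  for (const i of targets) replicas[i] = { value, version };
}

// Read from the given replicas and return the highest-versioned copy.
function quorumRead(replicas, targets) {
  return targets
    .map((i) => replicas[i])
    .reduce((best, cur) => (cur.version > best.version ? cur : best));
}

const n = 3;
const replicas = Array.from({ length: n }, () => ({ value: 'old', version: 1 }));

// w = 2: the write lands on replicas 0 and 1; replica 2 is still stale.
quorumWrite(replicas, [0, 1], 'new', 2);

// r = 2: reading replicas 1 and 2 still overlaps the write set at replica 1.
const result = quorumRead(replicas, [1, 2]);

console.log(quorumOverlaps(3, 2, 2), result.value);
console.log(quorumOverlaps(3, 1, 1));
```

With N = 3, W = 2, R = 2 the read returns `'new'` even though one of the two replicas it contacted was stale. With W = 1 and R = 1 the condition fails, so a read could land entirely on stale replicas.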

Replication lag and the bug it causes

With asynchronous replication, followers are eventually consistent: they catch up, but not instantly. The lag is usually milliseconds, though it can grow to seconds or more when a follower is overloaded or the network is slow. It produces a classic, confusing bug: a user makes a write, immediately reads, and the read hits a follower that hasn’t caught up, so their own change appears to vanish.

Replication — architecture diagram

This is the read-your-own-writes problem, and it’s one of the most common ways naive read-replica routing burns people.
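
The bug can be reproduced in a toy model, assuming a single follower and an explicit `replicate()` call standing in for the background replication stream. All names here are hypothetical.

```javascript
// Hypothetical in-memory model of asynchronous replication with lag.
const leaderData = new Map([['name', 'Ada']]);
const followerData = new Map([['name', 'Ada']]);
const inFlight = [];  // changes acknowledged by the leader, not yet on the follower

function write(key, value) {
  leaderData.set(key, value);
  inFlight.push([key, value]);  // shipped later, in the background
}

function readFromFollower(key) {
  return followerData.get(key);
}

// Stand-in for the replication stream eventually catching up.
function replicate() {
  while (inFlight.length > 0) {
    const [key, value] = inFlight.shift();
    followerData.set(key, value);
  }
}

write('name', 'Ada Lovelace');
const staleRead = readFromFollower('name');  // the user sees their old value
replicate();
const freshRead = readFromFollower('name');  // same query, moments later

console.log(staleRead, '->', freshRead);
```

The first read returns `'Ada'` even though the write already succeeded on the leader; only after replication catches up does the follower return `'Ada Lovelace'`.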

Failover

When the leader dies, a follower must be promoted. Done well, it’s automatic (via a consensus protocol like Raft) and takes seconds. Done badly, it causes:

  • Lost writes — an async leader’s un-replicated tail is gone on promotion.
  • Split-brain — the old leader revives and two nodes think they’re leader, accepting divergent writes. Fencing (a token only the current leader holds) prevents this.
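
One common way to realize fencing, sketched under the assumption that every promotion hands out a strictly larger token (for example a term or epoch number from the election) and that the storage layer checks it on each write. `FencedStore` is a hypothetical class, not a real library.

```javascript
// Hypothetical storage node that enforces fencing: it remembers the highest
// token it has seen and rejects writes that carry an older one.
class FencedStore {
  constructor() {
    this.highestToken = 0;
    this.data = new Map();
  }
  write(token, key, value) {
    if (token < this.highestToken) {
      return false;  // stale leader: refuse the write
    }
    this.highestToken = token;
    this.data.set(key, value);
    return true;
  }
}

const store = new FencedStore();

// Assumed to come from the election mechanism; larger means newer.
const oldLeaderToken = 1;
const newLeaderToken = 2;

const first = store.write(oldLeaderToken, 'balance', 100);   // accepted
const second = store.write(newLeaderToken, 'balance', 120);  // accepted after failover
const zombie = store.write(oldLeaderToken, 'balance', 90);   // old leader revives: rejected

console.log(first, second, zombie, store.data.get('balance'));
```

The revived old leader still believes it is in charge, but its write is refused because the store has already seen token 2, so the balance stays at 120.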

The JavaScript angle

In Node you usually run two connection pools — one to the leader for writes, one to the followers for reads — and the read-your-own-writes trap is yours to avoid in app code:

Route the just-wrote user back to the leader (script.js)
import { Pool } from 'pg';
const leader = new Pool({ host: 'db-leader' });
const replica = new Pool({ host: 'db-replica' });

// Naive: every read goes to the replica → user can't see their own write.
async function getProfile(userId) {
  return replica.query('SELECT * FROM profiles WHERE id = $1', [userId]);
}

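// Fix: for a short window after a user's write, read from the leader.
// Note: this Map lives in one process. With several app instances, keep the
// marker somewhere shared (a cookie or a cache) instead.
const recentlyWrote = new Map(); // userId -> expiry timestamp (ms since epoch)

async function updateProfile(userId, data) {
  await leader.query('UPDATE profiles SET name = $1 WHERE id = $2', [data.name, userId]);
  // Pin to the leader for 2s; the window should exceed typical replication lag.
  recentlyWrote.set(userId, Date.now() + 2000);
}

async function getProfileSafe(userId) {
  const pinned = (recentlyWrote.get(userId) ?? 0) > Date.now();
  if (!pinned) recentlyWrote.delete(userId); // drop expired entries so the Map doesn't grow forever
  const pool = pinned ? leader : replica;    // read your own writes
  return pool.query('SELECT * FROM profiles WHERE id = $1', [userId]);
}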
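
A fixed time window is a guess about lag. An alternative, sketched here in memory under stated assumptions, is to remember the log position of each user's latest write and use a replica only once it has applied at least that position (in a real database this would be something like a log sequence number). All names below are hypothetical.

```javascript
// Hypothetical model: the leader assigns each write an increasing position,
// and the replica reports the last position it has applied.
const leaderNode = { position: 0, data: new Map() };
const replicaNode = { position: 0, data: new Map() };
const lastWritePosition = new Map();  // userId -> position of that user's latest write

function updateName(userId, name) {
  leaderNode.position += 1;
  leaderNode.data.set(userId, name);
  lastWritePosition.set(userId, leaderNode.position);
}

// Stand-in for the replica catching up with the leader.
function syncReplica() {
  for (const [key, value] of leaderNode.data) replicaNode.data.set(key, value);
  replicaNode.position = leaderNode.position;
}

function readName(userId) {
  const needed = lastWritePosition.get(userId) ?? 0;
  // Use the replica only if it already contains this user's latest write.
  const node = replicaNode.position >= needed ? replicaNode : leaderNode;
  return {
    name: node.data.get(userId),
    servedBy: node === leaderNode ? 'leader' : 'replica',
  };
}

updateName('u1', 'Ada');
const beforeSync = readName('u1');  // replica is behind, so the leader serves the read
syncReplica();
const afterSync = readName('u1');   // replica has caught up, so it serves the read

console.log(beforeSync, afterSync);
```

Both reads return `'Ada'`; what changes is which node serves them. The trade-off is extra bookkeeping per user in exchange for not depending on a guessed lag bound.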

Replication gives you more copies of the same data. The next scaling move is splitting different data across machines so no single node holds it all — sharding.