Back-of-the-Envelope Estimation

Order-of-Magnitude Math Decides the Architecture

Estimating QPS, storage, and bandwidth from a user count — the quick math that tells you whether one database is enough or you need fifty.

9 min read | Level 3/5 | #system-design #estimation #capacity-planning
What you'll learn
  • Convert DAU into requests per second
  • Estimate storage and bandwidth per year
  • Use powers of two and ten to reason about scale quickly

Back-of-the-envelope estimation is the math that turns “design Twitter” into “we need roughly 12,000 writes per second at peak and 44 terabytes of new text data a year.” You’re not computing exact numbers; you’re finding the order of magnitude that decides the shape of the system. One server or a thousand? One database or a sharded fleet? The estimate answers that.

The numbers worth memorizing

Quantity                Value          Rounded
Seconds in a day        86,400         ~10⁵ (100K)
Seconds in a month      2,592,000      ~2.5 × 10⁶
KB                      10³ bytes      thousand
MB                      10⁶ bytes      million
GB                      10⁹ bytes      billion
TB                      10¹² bytes     trillion
Char (ASCII)            1 byte
Char (typical UTF-8)    1–3 bytes
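
To keep the unit conversions honest while doing mental math, it can help to have a small formatter that walks up the decimal units from the table. This is a minimal sketch: `formatBytes` and `UNITS` are hypothetical names chosen for illustration, and it assumes decimal units (1 KB = 10³ bytes) rather than binary ones (1 KiB = 2¹⁰ bytes).

```javascript
// Decimal (SI) units, matching the table: 1 KB = 10^3 bytes, 1 TB = 10^12 bytes.
const UNITS = ["bytes", "KB", "MB", "GB", "TB", "PB"];

// formatBytes is a hypothetical helper, not part of any library.
function formatBytes(bytes) {
  let value = bytes;
  let unit = 0;
  // Divide by 1,000 until the value drops below 1,000 or we run out of units.
  while (value >= 1000 && unit < UNITS.length - 1) {
    value /= 1000;
    unit += 1;
  }
  // Keep at most one decimal place; the unary plus drops a trailing ".0".
  return `${+value.toFixed(1)} ${UNITS[unit]}`;
}

console.log(formatBytes(300));     // 300 bytes
console.log(formatBytes(15_000));  // 15 KB
console.log(formatBytes(1.2e11));  // 120 GB
console.log(formatBytes(4.38e13)); // 43.8 TB
```

Each step up the table is a factor of 1,000, so counting the divisions is the same as counting groups of three zeros.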

The single most useful trick: ~100,000 seconds in a day. It means

requests per second (QPS) ≈ daily requests ÷ 100,000

So 1 billion requests/day ≈ 10,000 QPS. You can do that division in your head.
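
How much does the shortcut cost in accuracy? The sketch below compares the rounded divisor with the exact 86,400 seconds. The function names are illustrative only.

```javascript
const SECONDS_PER_DAY = 86_400;      // exact
const ROUNDED_SECONDS_PER_DAY = 1e5; // the mental-math shortcut

// Exact average QPS for a given number of requests per day.
function qpsExact(requestsPerDay) {
  return requestsPerDay / SECONDS_PER_DAY;
}

// Rule-of-thumb QPS: divide by 100,000 instead of 86,400.
function qpsRuleOfThumb(requestsPerDay) {
  return requestsPerDay / ROUNDED_SECONDS_PER_DAY;
}

const daily = 1e9;
const exact = qpsExact(daily);
const quick = qpsRuleOfThumb(daily);
const shortfall = 1 - quick / exact; // fraction by which the shortcut undershoots

console.log(Math.round(exact));                // 11574
console.log(quick);                            // 10000
console.log((shortfall * 100).toFixed(1) + "%"); // 13.6%
```

The shortcut undershoots by about 14%, which is well inside the noise of the assumptions feeding it (posts per user, peak factor), so it does not change the order of magnitude.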

Worked example: a Twitter-like service

Let’s estimate from a single starting number — 200 million daily active users — and a few reasonable assumptions.

Step 1 — Write QPS. Say each user posts 2 tweets/day on average:

writes/day = 200M users × 2 = 400M tweets/day
write QPS  = 400M ÷ 100,000 ≈ 4,000 writes/sec  (average)

Traffic isn’t flat, so multiply by a peak factor (~3×):

peak write QPS ≈ 12,000 writes/sec

Step 2 — Read QPS. Reads dominate social systems. If each user loads their timeline ~20 times/day:

reads/day = 200M × 20 = 4B reads/day
read QPS  = 4B ÷ 100,000 ≈ 40,000 reads/sec  (average)
peak      ≈ 120,000 reads/sec

The read:write ratio is ~10:1 — that single fact justifies caching and read replicas later.
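
Steps 1 and 2 can be sketched together so the ratio falls out of the same inputs. This is a rough sketch under the article's assumptions (÷100,000 shortcut, 3× peak factor); `trafficProfile` and its parameter names are hypothetical.

```javascript
const ROUNDED_SECONDS_PER_DAY = 1e5; // the ~100,000 seconds/day shortcut
const PEAK_FACTOR = 3;               // assumed peak-to-average multiplier

// Derives average and peak QPS for both paths, plus the read:write ratio.
function trafficProfile({ dau, writesPerUserPerDay, readsPerUserPerDay }) {
  const avgWriteQps = (dau * writesPerUserPerDay) / ROUNDED_SECONDS_PER_DAY;
  const avgReadQps = (dau * readsPerUserPerDay) / ROUNDED_SECONDS_PER_DAY;
  return {
    avgWriteQps,
    peakWriteQps: avgWriteQps * PEAK_FACTOR,
    avgReadQps,
    peakReadQps: avgReadQps * PEAK_FACTOR,
    readWriteRatio: avgReadQps / avgWriteQps,
  };
}

const profile = trafficProfile({
  dau: 200e6,
  writesPerUserPerDay: 2,
  readsPerUserPerDay: 20,
});
console.log(profile);
// { avgWriteQps: 4000, peakWriteQps: 12000, avgReadQps: 40000,
//   peakReadQps: 120000, readWriteRatio: 10 }
```

Note that the ratio depends only on the per-user read and write counts (20 ÷ 2), not on DAU, so it stays at 10:1 as the user base grows.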

Step 3 — Storage per year. Say a stored tweet (text + metadata) is ~300 bytes:

storage/day  = 400M tweets × 300 bytes = 120 GB/day
storage/year = 120 GB × 365 ≈ 44 TB/year  (text only)

Add media (images/video) and that number explodes — which tells you media needs a separate object-storage path from text. The estimate surfaced an architectural decision.
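
To see how quickly media dominates, here is a sketch with made-up media numbers. The 10% attachment rate and the 200 KB image size are illustrative assumptions, not figures from this article; swap in your own.

```javascript
// ASSUMPTIONS for illustration only: neither value comes from the article.
const ASSUMED_MEDIA_FRACTION = 0.1;    // assume 1 in 10 tweets carries an image
const ASSUMED_BYTES_PER_IMAGE = 200e3; // assume ~200 KB per stored image

// Yearly storage in decimal terabytes for a daily item count and item size.
function yearlyStorageTB(itemsPerDay, bytesPerItem) {
  return (itemsPerDay * bytesPerItem * 365) / 1e12;
}

const tweetsPerDay = 400e6; // from Step 1
const textTB = yearlyStorageTB(tweetsPerDay, 300);
const mediaTB = yearlyStorageTB(
  tweetsPerDay * ASSUMED_MEDIA_FRACTION,
  ASSUMED_BYTES_PER_IMAGE
);

console.log(`text:  ${textTB.toFixed(1)} TB/year`);           // text:  43.8 TB/year
console.log(`media: ${(mediaTB / 1000).toFixed(1)} PB/year`); // media: 2.9 PB/year
console.log(`media is ~${Math.round(mediaTB / textTB)}x text`); // media is ~67x text
```

Even with only one tweet in ten carrying a modest image, media outweighs text by well over an order of magnitude under these assumptions.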

The same math, as a tiny estimator (script.js)
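const SECONDS_PER_DAY = 86_400; // exact, so QPS comes out ~16% above the ÷100,000 shortcut
const PEAK_FACTOR = 3;          // peak traffic assumed to be ~3× the average

// Turns a daily action count into average QPS, peak QPS, and yearly storage (decimal TB).
function estimate({ dau, actionsPerUserPerDay, bytesPerAction }) {
  const perDay = dau * actionsPerUserPerDay;
  const avgQps = perDay / SECONDS_PER_DAY;
  const storagePerYearTB = (perDay * bytesPerAction * 365) / 1e12;
  return {
    avgQps: Math.round(avgQps),
    peakQps: Math.round(avgQps * PEAK_FACTOR),
    storagePerYearTB: +storagePerYearTB.toFixed(1),
  };
}

// Write path from the worked example: 200M DAU, 2 tweets/user/day, ~300 bytes/tweet.
console.log(estimate({ dau: 200e6, actionsPerUserPerDay: 2, bytesPerAction: 300 }));
// { avgQps: 4630, peakQps: 13889, storagePerYearTB: 43.8 }
// The hand math gives 4,000 and 12,000 because it rounds a day to 100,000 seconds.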

Bandwidth, the often-forgotten axis

Bandwidth = QPS × payload size. A timeline read returning 50 tweets at ~300 bytes each is ~15 KB:

read bandwidth = 40,000 reads/sec × 15 KB ≈ 600 MB/sec ≈ 4.8 Gbps  (average)

That is the average rate; at the 3× peak it is roughly 14.4 Gbps. Either figure is enough to justify a CDN and aggressive caching for read paths.
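
The bandwidth arithmetic, including the bytes-to-bits conversion that is easy to forget, can be sketched as follows. `readBandwidth` and its parameter names are hypothetical; the inputs are the ones used above.

```javascript
// Bandwidth = QPS × payload size, reported in MB/s and Gbps (decimal units).
function readBandwidth({ readQps, itemsPerResponse, bytesPerItem }) {
  const payloadBytes = itemsPerResponse * bytesPerItem;
  const bytesPerSec = readQps * payloadBytes;
  return {
    payloadKB: payloadBytes / 1e3,
    megabytesPerSec: bytesPerSec / 1e6,
    gigabitsPerSec: (bytesPerSec * 8) / 1e9, // 8 bits per byte
  };
}

const avg = readBandwidth({ readQps: 40_000, itemsPerResponse: 50, bytesPerItem: 300 });
const peak = readBandwidth({ readQps: 120_000, itemsPerResponse: 50, bytesPerItem: 300 });

console.log(avg);  // { payloadKB: 15, megabytesPerSec: 600, gigabitsPerSec: 4.8 }
console.log(peak); // { payloadKB: 15, megabytesPerSec: 1800, gigabitsPerSec: 14.4 }
```

Network links are quoted in bits per second while payloads are sized in bytes, so the factor of 8 is the step most likely to be dropped in a quick estimate.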

What the estimate tells you

Each number maps to a design decision:

Estimate                   Implication
Peak write QPS ~12K        One DB primary can’t absorb it → shard or queue writes
Read:write ~10:1           Add caching + read replicas
44 TB/year text            Plan partitioning and retention from day one
4.8 Gbps read bandwidth    CDN + edge caching for the read path
Media >> text              Separate object-storage path (S3-style)

Next: the latency numbers that tell you how long each piece of that request actually takes.