Unique IDs at Scale

Generate Billions of IDs Without a Central Bottleneck

UUID v4 vs v7, Snowflake's timestamp-machine-sequence layout, and ULID — plus why sortability and index locality decide which one you want.

What you'll learn
  • Compare UUID v4, UUID v7, Snowflake, and ULID
  • Explain why sortable IDs improve index locality
  • Implement a Snowflake-style generator in JavaScript

Every row, message, and event needs a unique identifier, and at scale a database auto-increment won't do, because it forces every insert through a single coordinating node. You want IDs that many machines can mint independently while still being globally unique. That's the core problem, and the solutions trade off along two axes that matter for both availability and database performance.

The two axes that matter

  1. Uniqueness without coordination — can a node generate an ID alone, without asking a central service? (Latency and availability both depend on this.)
  2. Sortability / index locality — do IDs created near in time sort near each other? This sounds cosmetic but it’s a real performance lever, explained below.

The contenders

| Scheme | Bits | Coordination | Time-sortable | Notes |
| --- | --- | --- | --- | --- |
| Auto-increment | varies | Central DB | Yes | Simple, but a write bottleneck |
| UUID v4 | 128 | None (random) | No | Ubiquitous, but random hurts indexes |
| UUID v7 | 128 | None | Yes | Timestamp-prefixed; v4's successor |
| Snowflake | 64 | None (per-machine) | Yes | Compact, sortable, needs machine IDs |
| ULID | 128 | None | Yes | Timestamp + random, Crockford base32 |

  • UUID v4 is 122 random bits. Collisions are astronomically unlikely and you need zero coordination — which is why it’s everywhere. Its flaw is purely that it’s random, with consequences for your indexes (next section).
  • UUID v7 keeps UUID’s 128-bit format but puts a millisecond timestamp in the high bits, so IDs minted close in time sort close together. It’s the modern default — UUID’s convenience, plus sortability.
  • Snowflake (from Twitter) packs everything into a compact 64-bit integer: a timestamp, a machine ID, and a per-millisecond sequence. Small and sortable, at the cost of needing to hand each generator a unique machine ID.
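  • ULID is 128 bits like a UUID (a 48-bit millisecond timestamp followed by 80 random bits), so it sorts lexicographically by creation time, and its 26-character Crockford base32 encoding is friendlier to read than hex.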
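
To make the UUID v7 layout concrete, here is a minimal sketch of the bit packing described in RFC 9562: 48 bits of Unix-epoch milliseconds, a 4-bit version, 12 random bits, a 2-bit variant, and 62 more random bits. The function name `uuidv7` is illustrative, the code assumes a runtime with Web Crypto (falling back to Node's built-in `node:crypto` module), and it omits the optional monotonic counter, so IDs minted within the same millisecond are not ordered among themselves.

```javascript
// Sketch of a UUID v7 generator following the RFC 9562 field layout.
// Assumption: Web Crypto is available as a global; older Node.js falls back to node:crypto.
const webCrypto = globalThis.crypto ?? require('node:crypto').webcrypto;

function uuidv7(nowMs = Date.now()) {
  const bytes = new Uint8Array(16);
  webCrypto.getRandomValues(bytes); // start with 16 random bytes

  // Bytes 0..5: 48-bit big-endian Unix timestamp in milliseconds.
  let ts = BigInt(nowMs);
  for (let i = 5; i >= 0; i--) {
    bytes[i] = Number(ts & 0xffn);
    ts >>= 8n;
  }

  bytes[6] = (bytes[6] & 0x0f) | 0x70; // version 7 in the high nibble
  bytes[8] = (bytes[8] & 0x3f) | 0x80; // variant bits "10"

  const hex = Array.from(bytes, (b) => b.toString(16).padStart(2, '0')).join('');
  return `${hex.slice(0, 8)}-${hex.slice(8, 12)}-${hex.slice(12, 16)}-` +
         `${hex.slice(16, 20)}-${hex.slice(20)}`;
}

const earlier = uuidv7(1_700_000_000_000);
const later = uuidv7(1_700_000_000_001);
console.log(earlier, later, earlier < later); // the later ID sorts after the earlier one
```

Because the timestamp occupies the leading hex digits, plain string comparison orders IDs from different milliseconds by creation time.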

Why sortability matters: index locality

Here’s the non-obvious payoff. Most databases store the primary key in a B-tree, kept in sorted order. When you insert a random key (UUID v4), it lands at a random spot in the tree — so consecutive inserts scatter across the whole index, touching random pages, thrashing the cache, and fragmenting the structure (lots of page splits).

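A time-sortable key (UUID v7, Snowflake, ULID) inserts at or near the right edge of the tree, because new IDs are almost always larger than existing ones. (IDs minted on different machines in the same millisecond, or under slight clock skew, can arrive a little out of order.) Inserts hit the same few hot pages, the cache stays warm, and page splits happen mostly at the tail of the index rather than throughout it.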
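
A rough way to see the difference is to simulate it. The sketch below is a toy model, not a real B-tree: it keeps keys in a sorted array, treats every run of `PAGE_SIZE` keys as one page, and counts how many distinct pages a batch of inserts lands on. The names (`pagesTouched`, `PAGE_SIZE`, `seededRandom`) and the key ranges are illustrative assumptions.

```javascript
// Toy model of index locality: a sorted array stands in for the B-tree's leaf level.
const PAGE_SIZE = 100; // assumed number of keys per "page"

// Small seeded PRNG (mulberry32-style) so the run is repeatable.
function seededRandom(seed) {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Binary search for the position where `key` belongs in the sorted array.
function insertionIndex(sorted, key) {
  let lo = 0, hi = sorted.length;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (sorted[mid] < key) lo = mid + 1; else hi = mid;
  }
  return lo;
}

// Insert `batch` into an index that already holds `existing` keys and
// return how many distinct pages the inserts landed on.
function pagesTouched(existing, batch) {
  const sorted = [...existing].sort((x, y) => x - y);
  const pages = new Set();
  for (const key of batch) {
    const at = insertionIndex(sorted, key);
    pages.add(Math.floor(at / PAGE_SIZE));
    sorted.splice(at, 0, key);
  }
  return pages.size;
}

const rand = seededRandom(42);
const existing = Array.from({ length: 10_000 }, (_, i) => i * 10); // 100 pages of old rows
const randomKeys = Array.from({ length: 200 }, () => Math.floor(rand() * 100_000));
const timeOrderedKeys = Array.from({ length: 200 }, (_, i) => 100_000 + i);

console.log('random keys touched pages:      ', pagesTouched(existing, randomKeys));
console.log('time-ordered keys touched pages:', pagesTouched(existing, timeOrderedKeys));
```

In this model the time-ordered batch lands on the last two pages, while the random batch spreads across most of the index. Real engines add buffering and page splits on top, but the access pattern follows the same idea.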

The JavaScript angle: a Snowflake generator

Snowflake is the most instructive to build because you can see every bit doing a job. The 64-bit layout: 1 unused sign bit, 41 bits of millisecond timestamp (since a custom epoch), 10 bits of machine ID (1,024 machines), and 12 bits of sequence (4,096 IDs per machine per millisecond).
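
The capacities quoted above fall straight out of the bit widths. The short calculation below is a sketch that reuses the custom epoch from the generator in this section; the constant and variable names are illustrative.

```javascript
// Derive the limits of the 41/10/12 layout from the bit widths.
const TIMESTAMP_BITS = 41n;
const MACHINE_ID_BITS = 10n;
const SEQUENCE_BITS = 12n;
const CUSTOM_EPOCH_MS = 1_700_000_000_000n; // same custom epoch as the generator

const maxMachines = 1n << MACHINE_ID_BITS; // distinct machine IDs
const idsPerMs = 1n << SEQUENCE_BITS;      // IDs per machine per millisecond
const lifetimeMs = 1n << TIMESTAMP_BITS;   // milliseconds before the timestamp field wraps

const MS_PER_YEAR = 365.25 * 24 * 60 * 60 * 1000;
const lifetimeYears = Number(lifetimeMs) / MS_PER_YEAR;
const exhaustedAt = new Date(Number(CUSTOM_EPOCH_MS + lifetimeMs));

console.log(maxMachines, idsPerMs);                 // 1024n 4096n
console.log(lifetimeYears.toFixed(1), 'years');     // 69.7 years
console.log(exhaustedAt.getUTCFullYear());          // 2093
console.log(1n + TIMESTAMP_BITS + MACHINE_ID_BITS + SEQUENCE_BITS); // 64n
```

The custom epoch matters because the 41-bit field counts from it: starting the clock in 2023 rather than 1970 pushes the wrap-around date out by the same 53 years.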

A Snowflake-style ID generator (BigInt) script.js
const EPOCH = 1_700_000_000_000n;   // custom epoch (ms): the 41-bit timestamp counts from here
const MACHINE_BITS = 10n;
const SEQ_BITS = 12n;
const MAX_MACHINE = (1n << MACHINE_BITS) - 1n; // 1023
const MAX_SEQ = (1n << SEQ_BITS) - 1n;         // 4095

class Snowflake {
  constructor(machineId) {
    if (!Number.isInteger(machineId) || machineId < 0 || machineId > Number(MAX_MACHINE)) {
      throw new RangeError('machineId must be an integer in 0..1023');
    }
    this.machineId = BigInt(machineId);
    this.lastMs = -1n;
    this.seq = 0n;
  }

  next() {
    let now = BigInt(Date.now());
    // Date.now() is wall-clock time and can step backwards (e.g. an NTP correction).
    // Reusing an earlier millisecond could repeat an ID, so refuse instead.
    if (now < this.lastMs) throw new Error('clock moved backwards; refusing to generate an ID');
    if (now === this.lastMs) {
      this.seq = (this.seq + 1n) & MAX_SEQ;
      if (this.seq === 0n) {
        // Sequence exhausted for this millisecond: spin until the clock advances.
        do { now = BigInt(Date.now()); } while (now <= this.lastMs);
      }
    } else {
      this.seq = 0n;
    }
    this.lastMs = now;

    // Shift each field into its slot and OR them together.
    return ((now - EPOCH) << (MACHINE_BITS + SEQ_BITS)) |
           (this.machineId << SEQ_BITS) |
           this.seq;
  }
}

const gen = new Snowflake(7);
console.log(gen.next().toString()); // e.g. "29874...": sortable, unique, 64-bit

Three things to notice. BigInt is essential — a 64-bit ID overflows JavaScript’s safe-integer range (53 bits), so a plain number would silently corrupt the low bits. The sequence lets one machine mint 4,096 IDs in a single millisecond before it has to wait. And the machine ID is what removes coordination — give each generator a distinct one (from config, etcd, or a ZooKeeper sequence) and they never collide.
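
The first point, and the bit layout itself, can be checked directly. The sketch below first shows a plain `number` losing the low bit of a 64-bit-sized value, then unpacks an ID back into its fields by reversing the shifts. `decodeSnowflake` is a hypothetical helper, and it assumes the same epoch and 41/10/12 layout as the generator above.

```javascript
// 1) Why BigInt: a value this large does not survive a round trip through Number.
const big = (1n << 62n) + 1n;
console.log(BigInt(Number(big)) === big); // false: the low bit was rounded away

// 2) Hypothetical decoder for the 41/10/12 layout used by the generator.
const EPOCH_MS = 1_700_000_000_000n;
const MACHINE_WIDTH = 10n;
const SEQ_WIDTH = 12n;

function decodeSnowflake(id) {
  const seq = id & ((1n << SEQ_WIDTH) - 1n);
  const machineId = (id >> SEQ_WIDTH) & ((1n << MACHINE_WIDTH) - 1n);
  const timestampMs = (id >> (SEQ_WIDTH + MACHINE_WIDTH)) + EPOCH_MS;
  return {
    createdAt: new Date(Number(timestampMs)), // millisecond timestamps fit safely in a Number
    machineId: Number(machineId),
    seq: Number(seq),
  };
}

// Build an ID by hand: 1 second after the epoch, machine 7, sequence 3.
const sample = (1000n << 22n) | (7n << 12n) | 3n;
console.log(sample.toString());       // "4194332675"
console.log(decodeSnowflake(sample)); // machineId 7, seq 3, createdAt = epoch + 1 s
```

Being able to recover the creation time from the ID alone is handy for debugging, and it is also the reason the closing advice steers you to UUID v4 when timing must stay hidden.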

In practice: reach for UUID v7 by default (sortable, zero setup), Snowflake when 64-bit compactness matters at huge scale, and UUID v4 when you must hide timing. With IDs in hand, the next building block is finding things by their content, not their key — full-text search.