Generate Billions of IDs Without a Central Bottleneck
Unique IDs at Scale
UUID v4 vs v7, Snowflake's timestamp-machine-sequence layout, and ULID — plus why sortability and index locality decide which one you want.
What you'll learn
- Compare UUID v4, UUID v7, Snowflake, and ULID
- Explain why sortable IDs improve index locality
- Implement a Snowflake-style generator in JavaScript
Every row, message, and event needs a unique identifier — and at scale you can’t just use a database auto-increment, because that forces every insert through a single coordinating node. You want IDs that many machines can mint independently, without coordinating, while still being unique. That’s the core problem, and the solutions trade off along a couple of axes that turn out to matter a lot for performance.
The two axes that matter
- Uniqueness without coordination — can a node generate an ID alone, without asking a central service? (Latency and availability both depend on this.)
- Sortability / index locality — do IDs created near in time sort near each other? This sounds cosmetic but it’s a real performance lever, explained below.
The contenders
| Scheme | Bits | Coordination | Time-sortable | Notes |
|---|---|---|---|---|
| Auto-increment | varies | Central DB | Yes | Simple, but a write bottleneck |
| UUID v4 | 128 | None (random) | No | Ubiquitous, but random hurts indexes |
| UUID v7 | 128 | None | Yes | Timestamp-prefixed; standardized in RFC 9562 |
| Snowflake | 64 | None (per-machine) | Yes | Compact, sortable, needs machine IDs |
| ULID | 128 | None | Yes | Timestamp + random, Crockford base32 |
- UUID v4 is 128 bits, of which 122 are random (the other 6 encode the version and variant). Collisions are astronomically unlikely and you need zero coordination, which is why it’s everywhere. Its flaw is purely that it’s random, with consequences for your indexes (next section).
- UUID v7 keeps UUID’s 128-bit format but puts a 48-bit Unix millisecond timestamp in the high bits, so IDs minted close in time sort close together. It’s a sensible modern default: UUID’s convenience, plus sortability.
- Snowflake (from Twitter) packs everything into a compact 64-bit integer: a timestamp, a machine ID, and a per-millisecond sequence. Small and sortable, at the cost of needing to hand each generator a unique machine ID.
- ULID is 128 bits like a UUID (a 48-bit millisecond timestamp followed by 80 random bits) but lexicographically sortable, encoded as 26 characters of human-friendlier Crockford base32.
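To make the timestamp-prefixed layout concrete, here is a minimal sketch of a UUID v7 generator, assuming Node.js (it uses `randomFillSync` from `node:crypto`). The function name `uuidv7` and its parameters are illustrative rather than a library API, and the sketch leaves out the optional monotonic counter for IDs minted within the same millisecond.

```javascript
const { randomFillSync } = require('node:crypto');

// Build a UUID v7 string from a millisecond timestamp and 10 random bytes.
// Both parameters are injectable so the layout is easy to inspect and test.
function uuidv7(nowMs = Date.now(), rand = randomFillSync(new Uint8Array(10))) {
  const bytes = new Uint8Array(16);

  // Bytes 0-5: 48-bit big-endian Unix timestamp in milliseconds.
  let ts = BigInt(nowMs);
  for (let i = 5; i >= 0; i--) {
    bytes[i] = Number(ts & 0xffn);
    ts >>= 8n;
  }

  // Bytes 6-15: random, then overwrite the version and variant bits.
  bytes.set(rand, 6);
  bytes[6] = (bytes[6] & 0x0f) | 0x70; // version 7 in the high nibble
  bytes[8] = (bytes[8] & 0x3f) | 0x80; // variant: top two bits are 10

  const hex = Array.from(bytes, (b) => b.toString(16).padStart(2, '0')).join('');
  return [
    hex.slice(0, 8),
    hex.slice(8, 12),
    hex.slice(12, 16),
    hex.slice(16, 20),
    hex.slice(20),
  ].join('-');
}

console.log(uuidv7());
```

Because the timestamp occupies the leading hex digits, plain string comparison orders these IDs by creation time down to the millisecond. IDs minted within the same millisecond sort arbitrarily in this sketch.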
Why sortability matters: index locality
Here’s the non-obvious payoff. Most databases store the primary key in a B-tree, kept in sorted order. When you insert a random key (UUID v4), it lands at a random spot in the tree — so consecutive inserts scatter across the whole index, touching random pages, thrashing the cache, and fragmenting the structure (lots of page splits).
A time-sortable key (UUID v7, Snowflake, ULID) inserts at or near the right edge of the tree, because new IDs are almost always larger than existing ones; clock skew between machines and the random or per-machine low bits cause only small, local reordering. Inserts hit the same few hot pages, the cache stays warm, and page splits are rare.
The JavaScript angle: a Snowflake generator
Snowflake is the most instructive to build because you can see every bit doing a job. The 64-bit layout: 1 unused sign bit, 41 bits of millisecond timestamp (since a custom epoch), 10 bits of machine ID (1,024 machines), and 12 bits of sequence (4,096 IDs per machine per millisecond).
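As a quick sanity check on those field widths, the arithmetic can be sketched as follows. `FIELD_BITS` and `capacity` are illustrative names; the bit counts come straight from the layout described above.

```javascript
// Field widths of the 64-bit layout described above.
const FIELD_BITS = { sign: 1, timestamp: 41, machine: 10, sequence: 12 };

const totalBits = Object.values(FIELD_BITS).reduce((sum, bits) => sum + bits, 0);
const MS_PER_YEAR = 365.25 * 24 * 60 * 60 * 1000;

const capacity = {
  totalBits,
  // How long 41 bits of milliseconds last, counted from the custom epoch.
  yearsOfTimestamps: 2 ** FIELD_BITS.timestamp / MS_PER_YEAR,
  machines: 2 ** FIELD_BITS.machine,
  // 4,096 IDs per millisecond, times 1,000 milliseconds per second.
  idsPerMachinePerSecond: 2 ** FIELD_BITS.sequence * 1000,
};

console.log(capacity);
```

The timestamp field lasts roughly 69 years from whatever epoch you choose, which is why a recent custom epoch is worth having: it spends none of that range on the past.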
```javascript
const EPOCH = 1_700_000_000_000n; // custom epoch (Nov 2023): the 41-bit range starts here
const MACHINE_BITS = 10n;
const SEQ_BITS = 12n;
const MAX_MACHINE_ID = (1n << MACHINE_BITS) - 1n; // 1023
const MAX_SEQ = (1n << SEQ_BITS) - 1n; // 4095

class Snowflake {
  constructor(machineId) {
    if (!Number.isInteger(machineId) || machineId < 0 || machineId > Number(MAX_MACHINE_ID)) {
      throw new Error('machineId must be an integer in 0..1023');
    }
    this.machineId = BigInt(machineId);
    this.lastMs = -1n;
    this.seq = 0n;
  }

  next() {
    let now = BigInt(Date.now());
    if (now < this.lastMs) {
      // Clock moved backwards (e.g. an NTP adjustment): fail rather than risk a duplicate.
      throw new Error('clock moved backwards; refusing to generate an ID');
    }
    if (now === this.lastMs) {
      this.seq = (this.seq + 1n) & MAX_SEQ;
      if (this.seq === 0n) {
        // Ran out of sequence this ms: busy-wait until the next millisecond.
        while (now <= this.lastMs) now = BigInt(Date.now());
      }
    } else {
      this.seq = 0n;
    }
    this.lastMs = now;
    // Shift each field into its slot and OR them together.
    return ((now - EPOCH) << (MACHINE_BITS + SEQ_BITS)) |
      (this.machineId << SEQ_BITS) |
      this.seq;
  }
}

const gen = new Snowflake(7);
console.log(gen.next().toString()); // e.g. "29874...": sortable, unique, 64-bit
```

Three things to notice. BigInt is essential: a 64-bit ID exceeds JavaScript’s safe-integer range (53 bits), so a plain number would silently corrupt the low bits. The sequence lets one machine mint 4,096 IDs in a single millisecond before it has to wait. And the machine ID is what removes per-ID coordination: give each generator a distinct one (from config, etcd, or a ZooKeeper sequence) and they never collide.
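Because every field sits in a fixed slot, an ID can also be taken apart again. The following is a small sketch of a decoder under the same assumed layout (41-bit timestamp, 10-bit machine ID, 12-bit sequence, same custom epoch); `decodeSnowflake` and `LAYOUT` are hypothetical names introduced here, not part of any library.

```javascript
// Assumed layout, matching the generator above.
const LAYOUT = { epoch: 1_700_000_000_000n, machineBits: 10n, seqBits: 12n };

// Split a Snowflake-style BigInt ID back into its three fields.
function decodeSnowflake(id, layout = LAYOUT) {
  const { epoch, machineBits, seqBits } = layout;
  const seq = id & ((1n << seqBits) - 1n); // low 12 bits
  const machineId = (id >> seqBits) & ((1n << machineBits) - 1n); // next 10 bits
  const timestampMs = (id >> (machineBits + seqBits)) + epoch; // remaining high bits
  return {
    timestamp: new Date(Number(timestampMs)),
    machineId: Number(machineId),
    seq: Number(seq),
  };
}

// Hand-built ID: 123,456 ms after the epoch, machine 7, sequence 42.
const sampleId = (123_456n << 22n) | (7n << 12n) | 42n;
console.log(decodeSnowflake(sampleId));
```

Being able to read the creation time and originating machine out of an ID is handy for debugging, and it is also the reason to prefer UUID v4 when that information must stay hidden.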
In practice: reach for UUID v7 by default (sortable, zero setup), Snowflake when 64-bit compactness matters at huge scale, and UUID v4 when you must hide timing. With IDs in hand, the next building block is finding things by their content, not their key — full-text search.