Cluster Mode

Use all the cores on a machine by forking Node processes with the built-in cluster module — or PM2.

What you'll learn

Cluster a server across cores
Use PM2 for the easy path
Know when containers replace this

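A Node process runs your JavaScript on a single thread, so one process can keep only about one CPU core busy. To use every core on a machine, run multiple processes, either with the built-in cluster module or with PM2, which wraps the same mechanism with less setup.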

Manual `cluster`

import cluster from "node:cluster";
import { cpus } from "node:os";
import { buildApp } from "./app.js";

if (cluster.isPrimary) {
  const n = cpus().length;
  console.log(`primary spawning ${n} workers`);
  for (let i = 0; i < n; i++) cluster.fork();

  cluster.on("exit", (worker) => {
    console.log(`worker ${worker.process.pid} died — restarting`);
    cluster.fork();
  });
} else {
  const app = buildApp();
  app.listen(3000, () => console.log(`worker ${process.pid} ready`));
}

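Every worker calls listen(3000), but the workers do not compete for the port. By default (on every platform except Windows) the primary process owns the listening socket and hands incoming connections to the workers in round-robin order. Each worker then handles its requests on its own event loop.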
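The exit handler above restarts every worker unconditionally, so a worker that crashes during startup would be re-forked in a tight loop. A minimal sketch of a restart guard follows. The names `checkRestart` and `superviseWorkers` and the limit of 5 restarts per 60 seconds are assumptions for illustration, not part of the cluster API.

```javascript
// Hypothetical limits: at most 5 restarts per rolling 60-second window.
const MAX_RESTARTS = 5;
const WINDOW_MS = 60_000;

// Pure helper: given the timestamps of earlier restarts, decide whether
// another restart is allowed. Returns the decision and the pruned history.
function checkRestart(history, now, max = MAX_RESTARTS, windowMs = WINDOW_MS) {
  const recent = history.filter((t) => now - t < windowMs);
  const allowed = recent.length < max;
  if (allowed) recent.push(now);
  return { allowed, history: recent };
}

// Attach the guard to the cluster module inside the primary process.
function superviseWorkers(cluster) {
  let history = [];
  cluster.on("exit", (worker, code, signal) => {
    // True when the worker was stopped on purpose via disconnect() or kill().
    if (worker.exitedAfterDisconnect) return;

    const result = checkRestart(history, Date.now());
    history = result.history;

    if (result.allowed) {
      console.log(`worker ${worker.process.pid} exited (${signal ?? code}), restarting`);
      cluster.fork();
    } else {
      console.error("too many worker restarts, giving up");
      process.exit(1);
    }
  });
}
```

To try it, call `superviseWorkers(cluster)` inside the `cluster.isPrimary` branch in place of the plain `cluster.on("exit", ...)` handler. Exiting the primary after repeated failures leaves the decision to whatever supervises it, such as systemd.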

PM2 (Production-Friendly)

npm install -g pm2

pm2 start src/index.js -i max --name my-api
pm2 logs my-api
pm2 reload my-api    # zero-downtime restart
pm2 monit            # live dashboard

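-i max starts one worker per available CPU core. PM2 also restarts crashed workers automatically, collects their logs, and supports zero-downtime reloads. Log rotation requires the separate pm2-logrotate module.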

Shared State

Workers share no memory: each is a separate process with its own heap and event loop. State that every worker must see has to live outside the process:

Sessions — store in Redis, not in memory
Rate-limit counters — Redis-backed limiter
Caches — per-worker LRU is fine if cache misses are cheap; otherwise Redis

The in-memory shortcut that worked for a single process becomes a bug across cluster workers.
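A small simulation makes the bug concrete. It assumes a limit of 10 requests per client and 4 workers, models each worker's memory as its own `Map`, and uses a single shared `Map` as a stand-in for Redis. The helper names are hypothetical.

```javascript
// Build a limiter backed by `store`. Without an argument, each limiter gets
// its own Map, just as each cluster worker gets its own memory.
function makeLimiter(limit, store = new Map()) {
  return function allow(clientId) {
    const count = (store.get(clientId) ?? 0) + 1;
    store.set(clientId, count);
    return count <= limit;
  };
}

// Send `requests` calls from one client round-robin across the limiters
// and count how many were allowed through.
function countAllowed(limiters, requests, clientId = "client-a") {
  let allowed = 0;
  for (let i = 0; i < requests; i++) {
    if (limiters[i % limiters.length](clientId)) allowed++;
  }
  return allowed;
}

const LIMIT = 10;
const WORKERS = 4;

// Per-worker state: four independent counters.
const isolated = Array.from({ length: WORKERS }, () => makeLimiter(LIMIT));
const isolatedAllowed = countAllowed(isolated, 100);

// Shared state: one Map standing in for a Redis-backed store.
const sharedStore = new Map();
const shared = Array.from({ length: WORKERS }, () => makeLimiter(LIMIT, sharedStore));
const sharedAllowed = countAllowed(shared, 100);

console.log({ isolatedAllowed, sharedAllowed });
// prints { isolatedAllowed: 40, sharedAllowed: 10 }
```

With per-worker counters the effective limit is multiplied by the number of workers, and only the shared store enforces the limit that was intended.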

Modern Alternative — Containers

For real production, the common pattern is:

4 instances × 1 Node process each = 4 processes total

instead of:

1 instance × 4 cluster workers = 4 processes

Containers isolate failures, and an orchestrator scales them horizontally across machines and replaces them with rolling deploys. Each container runs one Node process, so the cluster module never enters the picture.

PM2’s reload feature approximates on one machine what an orchestrator's rolling deploy does natively. For new apps targeting Kubernetes, Fargate, or Cloud Run: skip cluster, run one process per container, and scale the number of containers.
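With one process per container, zero-downtime deploys depend on the process shutting down cleanly: orchestrators such as Kubernetes send SIGTERM before removing a container. A minimal sketch of a handler follows. The name `createShutdownHandler` and the 10-second timeout are assumptions, and `server` is whatever `app.listen()` returns.

```javascript
// Build a signal handler that stops accepting new connections, lets in-flight
// requests finish, and force-exits if that takes longer than `timeoutMs`.
function createShutdownHandler(
  server,
  { timeoutMs = 10_000, exit = (code) => process.exit(code) } = {}
) {
  let shuttingDown = false;

  return function shutdown(signal) {
    if (shuttingDown) return; // ignore repeated signals
    shuttingDown = true;
    console.log(`${signal} received, closing server`);

    // Safety net: exit with an error code if close() never completes.
    const timer = setTimeout(() => exit(1), timeoutMs);
    timer.unref(); // do not keep the process alive just for this timer

    server.close((err) => {
      clearTimeout(timer);
      exit(err ? 1 : 0);
    });
  };
}

// Usage in the entry file (shown as comments so this block has no side effects):
//   const server = buildApp().listen(3000);
//   const shutdown = createShutdownHandler(server);
//   process.on("SIGTERM", () => shutdown("SIGTERM"));
//   process.on("SIGINT", () => shutdown("SIGINT"));
```

The `exit` option exists only so the handler can be exercised without terminating the process. In real use the default is enough.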

When Cluster Is Right

Single VPS or bare-metal server
Mostly CPU-bound (image processing, etc.)
No orchestrator above you

For everything else: containers. Less complexity, more capability.
