Clustering with node:cluster

Use Node's built-in cluster module to fork one worker per CPU core, passing app.callback() to each worker's HTTP server for zero-code-change scaling.


What you'll learn

Fork worker processes with node:cluster and os.availableParallelism
Pass app.callback() to http.createServer in each worker
Understand the sticky-session caveat for stateful WebSocket or session workloads

Node.js executes your JavaScript on a single thread, so a Koa server on an 8-core machine uses only one core unless you explicitly fork workers. The built-in node:cluster module lets you do that without changing your Koa application code at all, because app.callback() returns a plain (req, res) request handler that any Node HTTP server can use.

The Cluster Pattern

// cluster.js
import cluster from "node:cluster";
import { availableParallelism } from "node:os";
import { createServer } from "node:http";
import app from "./app.js"; // your Koa app

const numCPUs = availableParallelism();

if (cluster.isPrimary) {
  console.log(`Primary ${process.pid} is running`);

  for (let i = 0; i < numCPUs; i++) {
    cluster.fork();
  }

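  cluster.on("exit", (worker, code, signal) => {
    // Skip workers that were stopped on purpose via disconnect() or kill()
    if (worker.exitedAfterDisconnect) return;
    console.warn(`Worker ${worker.process.pid} exited (code ${code}, signal ${signal}), respawning`);
    cluster.fork();
  });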
} else {
  // Each worker runs independently
  createServer(app.callback()).listen(3000, () => {
    console.log(`Worker ${process.pid} listening on :3000`);
  });
}
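
The worker count in cluster.js comes straight from availableParallelism(), which only exists on newer Node releases (18.14 and 19.4 onward). The following is a minimal sketch of a more defensive count, not part of the original example: it falls back to os.cpus().length on older runtimes and honors an optional override. The workerCount helper and the WEB_CONCURRENCY variable name are assumptions chosen for illustration; Node does not read that variable itself.

```javascript
import os from "node:os";

// Hypothetical helper: decide how many workers to fork.
// WEB_CONCURRENCY is an assumed, optional override (useful in containers
// where the CPU limit is lower than the host's core count).
function workerCount(env = process.env) {
  const override = Number.parseInt(env.WEB_CONCURRENCY ?? "", 10);
  if (Number.isInteger(override) && override > 0) return override;

  // availableParallelism() is missing on older Node versions,
  // so fall back to the number of logical cores (at least 1).
  const detected =
    typeof os.availableParallelism === "function"
      ? os.availableParallelism()
      : os.cpus().length;
  return Math.max(1, detected);
}

console.log(`Would fork ${workerCount()} workers`);
```

In cluster.js you could then write const numCPUs = workerCount(); and leave the rest unchanged.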

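By default the primary process accepts incoming connections and hands them to the workers round-robin; on Windows the default is different, and workers accept connections directly with the operating system deciding who gets each one. No changes to app.js are required, provided it exports the app without calling app.listen() itself: app.callback() is the integration point.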
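
How connections are handed out is controlled by cluster.schedulingPolicy. As a small sketch (not something the example above needs), this pins the policy to round-robin explicitly, which is already the default everywhere except Windows. The policyName variable is only there for logging.

```javascript
import cluster from "node:cluster";

// The policy must be set in the primary before the first cluster.fork() call.
// cluster.SCHED_RR:   the primary accepts connections and hands them out round-robin.
// cluster.SCHED_NONE: workers accept connections directly; the OS decides who gets each one.
if (cluster.isPrimary) {
  cluster.schedulingPolicy = cluster.SCHED_RR;
}

const policyName =
  cluster.schedulingPolicy === cluster.SCHED_RR ? "round-robin" : "os-managed";
console.log(`Scheduling policy: ${policyName}`);
```

The same setting can be supplied without code through the NODE_CLUSTER_SCHED_POLICY environment variable, whose accepted values are rr and none.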

Using PM2 Instead

PM2 provides the same multi-core distribution with a simpler CLI and a built-in process dashboard:

npm i -g pm2
pm2 start server.js -i max   # one worker per CPU
pm2 monit                    # live dashboard
pm2 save && pm2 startup      # survive reboots

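-i max runs the script in PM2's cluster mode with one worker per CPU core. Here server.js is a plain single-process entry point (for example, one that just calls app.listen(3000)), not the cluster.js shown earlier: PM2 does the forking itself and respawns crashed workers automatically.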

The Sticky-Session Caveat

The cluster module distributes connections round-robin across workers. This is fine for stateless REST APIs. It breaks for:

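Server-side sessions stored in memory: the second request may hit a different worker that has no session data.
WebSocket libraries with an HTTP fallback (for example Socket.IO long polling): an open WebSocket stays on one TCP connection and therefore one worker, but the polling requests that precede or replace it must all reach the same worker.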
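
The in-memory session problem can be reproduced without starting any servers. The sketch below is a simulation under stated assumptions, not real cluster code: SimulatedWorker and roundRobin are hypothetical names, each worker's Map stands in for process memory, and the shared Map stands in for an external store such as Redis.

```javascript
// Each simulated worker keeps sessions in its own Map unless a shared store is passed in.
class SimulatedWorker {
  constructor(id, store = new Map()) {
    this.id = id;
    this.sessions = store;
  }

  // Count visits per session ID, as an in-memory session middleware might.
  handle(sessionId) {
    const visits = (this.sessions.get(sessionId) ?? 0) + 1;
    this.sessions.set(sessionId, visits);
    return visits;
  }
}

// Hand each request to the next worker in turn.
function roundRobin(workers) {
  let next = 0;
  return (sessionId) => workers[next++ % workers.length].handle(sessionId);
}

// Private memory: the second request lands on a worker that has never seen "abc".
const isolated = roundRobin([new SimulatedWorker(1), new SimulatedWorker(2)]);
const isolatedVisits = [isolated("abc"), isolated("abc"), isolated("abc")];

// Shared store: every worker reads and writes the same session data.
const sharedStore = new Map(); // stand-in for Redis
const shared = roundRobin([
  new SimulatedWorker(1, sharedStore),
  new SimulatedWorker(2, sharedStore),
]);
const sharedVisits = [shared("abc"), shared("abc"), shared("abc")];

console.log(isolatedVisits); // [ 1, 1, 2 ]
console.log(sharedVisits); // [ 1, 2, 3 ]
```

With private memory the visit counter restarts whenever the request lands on the other worker; with a shared store it increments as the user would expect.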

Solutions:

Problem	Solution
Memory sessions	Move sessions to Redis (use `koa-session` + Redis store)
WebSockets	Use sticky routing at the load balancer (NGINX `ip_hash`)
Shared state	Use a shared store (Redis, Postgres) instead of worker memory
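
Sticky routing boils down to choosing the target from a stable property of the client rather than taking turns. The following sketch shows that idea with a small FNV-1a string hash. It is illustrative only: it is not the algorithm NGINX's ip_hash uses, and fnv1a and pickWorker are hypothetical helper names.

```javascript
// FNV-1a: a small deterministic 32-bit string hash (for illustration only).
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    // Multiply by the FNV prime and keep the result as an unsigned 32-bit value.
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

// Map a client address to a worker index that stays the same across requests.
function pickWorker(clientIp, workerCount) {
  return fnv1a(clientIp) % workerCount;
}

const first = pickWorker("203.0.113.7", 4);
const second = pickWorker("203.0.113.7", 4);
console.log(first === second); // true: same client, same worker
```

Two caveats apply to any scheme like this: the mapping changes when the number of workers changes, and clients behind the same NAT or proxy share one address and therefore one worker. Note also that NGINX's ip_hash selects among upstream servers, so it helps when each worker or instance is exposed as its own upstream.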

For most stateless JSON API workloads none of these apply, and cluster lets throughput scale with the number of cores at the cost of one process's worth of memory per worker.
