Saturate Every CPU Core via app.callback() and the Cluster API
Clustering with node:cluster
Use Node's built-in cluster module to fork one worker per CPU core, passing app.callback() to each worker's HTTP server for zero-code-change scaling.
What you'll learn
- Fork worker processes with node:cluster and os.availableParallelism
- Pass app.callback() to http.createServer in each worker
- Understand the sticky-session caveat for stateful WebSocket or session workloads
Node.js runs your JavaScript on a single thread. A Koa server on an 8-core
machine uses only one of those cores unless you explicitly fork workers. The
built-in node:cluster module lets you do that without changing your Koa
application code at all, because app.callback() returns a plain HTTP request
handler that any Node HTTP server can use.
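To make "a plain HTTP handler" concrete, here is a minimal sketch. It deliberately uses a hand-written stand-in function instead of Koa, so it runs with no dependencies; with Koa you would pass app.callback() in its place. The demo function name and the use of an ephemeral port are illustrative assumptions, not part of the tutorial's app.

```javascript
import { createServer } from "node:http";

// Stand-in for app.callback(): any function with the (req, res) signature.
// With Koa you would write `const handler = app.callback();` instead.
const handler = (req, res) => {
  res.writeHead(200, { "Content-Type": "text/plain" });
  res.end(`handled ${req.method} ${req.url}`);
};

// Mount the handler on a server bound to an ephemeral port (0),
// send one request to it, then shut the server down again.
async function demo() {
  const server = createServer(handler);
  await new Promise((resolve) => server.listen(0, "127.0.0.1", resolve));
  const { port } = server.address();
  const response = await fetch(`http://127.0.0.1:${port}/hello`);
  const body = await response.text();
  server.close();
  server.closeAllConnections(); // drop the idle keep-alive socket (Node 18.2+)
  return body;
}

console.log(await demo()); // prints "handled GET /hello"
```

The server never needs to know where the handler came from, which is why the same Koa app can be mounted unchanged in every worker.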
The Cluster Pattern
```javascript
// cluster.js
import cluster from "node:cluster";
import { availableParallelism } from "node:os";
import { createServer } from "node:http";
import app from "./app.js"; // your Koa app

// availableParallelism() requires Node 18.14+ or 19.4+
const numCPUs = availableParallelism();

if (cluster.isPrimary) {
  console.log(`Primary ${process.pid} is running`);

  // Fork one worker per available core
  for (let i = 0; i < numCPUs; i++) {
    cluster.fork();
  }

  // Replace any worker that exits
  cluster.on("exit", (worker, code) => {
    console.warn(`Worker ${worker.process.pid} exited (code ${code}), respawning`);
    cluster.fork();
  });
} else {
  // Each worker runs independently
  createServer(app.callback()).listen(3000, () => {
    console.log(`Worker ${process.pid} listening on :3000`);
  });
}
```

By default the primary process accepts incoming connections and hands them to
the workers round-robin (on Windows, distribution is left to the operating
system). No changes to app.js are required: app.callback() is the integration
point.
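The exit handler above respawns unconditionally, so it also restarts workers that you shut down on purpose. One possible refinement is to make the decision explicit in a small helper. This is a sketch, not part of the original pattern: the shouldRespawn name and its policy are assumptions, while worker.exitedAfterDisconnect is the property node:cluster sets when a worker exits after worker.disconnect().

```javascript
// Decide whether the primary should fork a replacement for an exited worker.
// `worker` is the cluster Worker passed to the "exit" event;
// `code` is its exit code (null when the worker was killed by a signal).
function shouldRespawn(worker, code) {
  // Set by node:cluster when the exit followed worker.disconnect(),
  // meaning the shutdown was deliberate.
  if (worker.exitedAfterDisconnect) return false;
  // A clean exit (code 0) is left alone; crashes and signals trigger a respawn.
  return code !== 0;
}

// Usage sketch inside cluster.js:
// cluster.on("exit", (worker, code) => {
//   if (shouldRespawn(worker, code)) cluster.fork();
// });
```

Keeping the policy in a pure function makes it easy to unit test without forking real processes.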
Using PM2 Instead
PM2 provides the same multi-core distribution with a simpler CLI and a built-in process dashboard:
```shell
npm i -g pm2
pm2 start server.js -i max   # one worker per CPU
pm2 monit                    # live dashboard
pm2 save && pm2 startup      # survive reboots
```

-i max tells PM2 to start as many workers as there are CPU cores. PM2 also
respawns crashed workers automatically.
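The same settings can live in a PM2 ecosystem file instead of on the command line. The fragment below is a sketch under assumptions: the file name ecosystem.config.js, the app name koa-api, and the ./server.js path are placeholders; instances and exec_mode are the PM2 options that correspond to -i max in cluster mode.

```javascript
// ecosystem.config.js (hypothetical file name and app name)
module.exports = {
  apps: [
    {
      name: "koa-api",        // placeholder label shown in `pm2 list`
      script: "./server.js",  // same entry point as the CLI example
      instances: "max",       // equivalent of `-i max`
      exec_mode: "cluster",   // run the instances in PM2's cluster mode
    },
  ],
};
```

You would then start it with pm2 start ecosystem.config.js. In either form, server.js should be an ordinary single-process entry point, since PM2 does the forking for you.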
The Sticky-Session Caveat
The cluster module distributes connections round-robin across workers. This is fine for stateless REST APIs. It breaks for:
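- Server-side sessions stored in memory: the second request may hit a different worker that has no session data.
- WebSocket libraries with an HTTP long-polling fallback (such as Socket.IO): the several HTTP requests that make up one session must all reach the same worker. A raw WebSocket is a single TCP connection and therefore stays on one worker, but any per-connection state held in that worker's memory is invisible to the others.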
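The in-memory session failure is easy to reproduce without any network code. The sketch below simulates round-robin dispatch over per-worker Maps; the simulate function, the sid-123 key, and the use of a shared Map as a stand-in for an external store such as Redis are all illustrative assumptions.

```javascript
// Simulate `workers` processes, each with its own in-memory session Map,
// behind a round-robin dispatcher. With `shared: true`, every worker
// uses the same Map, which stands in for an external store.
function simulate({ workers, shared }) {
  const sharedStore = new Map();
  const stores = Array.from({ length: workers }, () =>
    shared ? sharedStore : new Map()
  );
  let next = 0;
  const dispatch = (fn) => fn(stores[next++ % workers]);

  // Request 1: log in. The session is written on whichever worker got it.
  dispatch((store) => store.set("sid-123", { user: "alice" }));
  // Request 2: read the session. Round-robin sends it to the next worker.
  return dispatch((store) => store.get("sid-123"));
}

console.log(simulate({ workers: 2, shared: false })); // undefined (session lost)
console.log(simulate({ workers: 2, shared: true }));  // { user: 'alice' }
```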
Solutions:
| Problem | Solution |
|---|---|
| Memory sessions | Move sessions to Redis (use koa-session + Redis store) |
| WebSockets | Use sticky routing at the load balancer (for example NGINX ip_hash, with each worker reachable as its own upstream server) |
| Shared state | Use a shared store (Redis, Postgres) instead of worker memory |
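Sticky routing boils down to mapping each client deterministically to one worker. The sketch below shows the idea by hashing the client address; pickWorkerIndex and the choice of FNV-1a are assumptions made for brevity, and this is not how NGINX ip_hash or node:cluster is implemented internally.

```javascript
// 32-bit FNV-1a hash of a string (chosen here only because it is short).
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // keep it an unsigned 32-bit value
  }
  return hash;
}

// Map a client address to a stable worker index in [0, workerCount).
function pickWorkerIndex(clientIp, workerCount) {
  return fnv1a(clientIp) % workerCount;
}

// The same address always lands on the same worker:
const first = pickWorkerIndex("203.0.113.7", 4);
const second = pickWorkerIndex("203.0.113.7", 4);
console.log(first === second); // true
```

The trade-off is that clients behind one shared IP address all land on the same worker, so load can become uneven.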
For most stateless JSON API workloads none of these apply, and cluster raises throughput on multi-core machines without any application changes.
Up Next
Configure a Koa app for production deployment — environment variables, reverse proxy trust, and zero-downtime restarts.