Performance Tuning

Apply koa-compress for gzip/br output, avoid blocking the event loop with sync work, and use autocannon to measure throughput before and after changes.

What you'll learn

Enable response compression with koa-compress
Identify and eliminate synchronous work that blocks the event loop
Run an autocannon benchmark to measure requests per second

Koa's minimal core means performance is largely determined by what you add to it. A few targeted changes (compression, async I/O discipline, and lean middleware) can raise throughput substantially without architectural changes. How much depends on the workload, so measure each change.

Response Compression

koa-compress negotiates the response encoding from the request's Accept-Encoding header and sets Content-Encoding (gzip or br, for example) in a single middleware:

npm i koa-compress

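import Koa from "koa";
import compress from "koa-compress";
import { constants } from "node:zlib";

const app = new Koa();

// Register compression before the middleware that sets the response body,
// so it wraps everything downstream.
app.use(
  compress({
    threshold: 1024, // skip compression for responses smaller than 1 KB (1024 bytes)
    gzip: { flush: constants.Z_SYNC_FLUSH },
    // Lower brotli quality trades a slightly larger payload for less CPU time
    br: { params: { [constants.BROTLI_PARAM_QUALITY]: 4 } },
  })
);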
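To get a feel for what compression buys on a given payload, it can help to compress a representative response body offline and compare sizes. The sketch below uses only node:zlib and the brotli quality from the configuration above. The payload is a made-up stand-in for a JSON list endpoint, and the resulting sizes will differ for real data.

```javascript
import { promisify } from "node:util";
import { gzip, brotliCompress, constants } from "node:zlib";

const gzipAsync = promisify(gzip);
const brotliAsync = promisify(brotliCompress);

// Hypothetical payload: a repetitive JSON body, similar to a list endpoint.
const payload = Buffer.from(
  JSON.stringify(
    Array.from({ length: 200 }, (_, i) => ({ id: i, name: `user-${i}`, active: true }))
  )
);

// gzip with zlib defaults; brotli with the same quality as the middleware config.
const gzipped = await gzipAsync(payload);
const brotlied = await brotliAsync(payload, {
  params: { [constants.BROTLI_PARAM_QUALITY]: 4 },
});

// Report each compressed size as a percentage of the original.
const percentOf = (part, whole) => ((part.length / whole.length) * 100).toFixed(1);

console.log(`original: ${payload.length} bytes`);
console.log(`gzip:     ${gzipped.length} bytes (${percentOf(gzipped, payload)}%)`);
console.log(`brotli:   ${brotlied.length} bytes (${percentOf(brotlied, payload)}%)`);
```

Bodies below the 1 KB threshold rarely shrink enough to justify the CPU cost, which is why the middleware skips them.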

Avoid Synchronous Work

Node.js runs your JavaScript on a single thread. Any synchronous CPU work blocks the event loop and stalls every in-flight request until it finishes.
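The effect is easy to observe without a server. The following sketch schedules a 0 ms timer and then busy-waits, which stands in for any synchronous CPU work. The helper names `blockFor` and `measureTimerDelay` are illustrative, not part of Node.js or Koa. The timer cannot fire until the blocking code returns, just as a pending request cannot be served.

```javascript
import { performance } from "node:perf_hooks";

// Busy-wait for `ms` milliseconds, standing in for synchronous CPU work.
function blockFor(ms) {
  const end = performance.now() + ms;
  while (performance.now() < end) {
    // spin
  }
}

// Resolve with how late a 0 ms timer actually fired.
function measureTimerDelay(blockMs) {
  return new Promise((resolve) => {
    const scheduled = performance.now();
    setTimeout(() => resolve(performance.now() - scheduled), 0);
    blockFor(blockMs); // runs to completion before the timer callback can fire
  });
}

const idleDelay = await measureTimerDelay(0);
const blockedDelay = await measureTimerDelay(50);

console.log(`timer delay with idle loop:    ${idleDelay.toFixed(1)} ms`);
console.log(`timer delay with 50 ms of work: ${blockedDelay.toFixed(1)} ms`);
```

In a server, every concurrent request experiences that same delay, which is why a single slow synchronous call shows up as a latency spike across all routes.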

Avoid	Use instead
`JSON.parse(bigString)` in hot path	Stream parsing, or move parsing to a worker thread
`fs.readFileSync`	`fs.promises.readFile`
`crypto.pbkdf2Sync`	`crypto.pbkdf2` (promisified)
`bcrypt.hashSync`	`bcrypt.hash` (async)

If you must do CPU-heavy work, offload it to a node:worker_threads worker so the event loop stays free.
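A minimal sketch of that offloading pattern, assuming a deliberately slow recursive Fibonacci as the CPU-heavy task. The worker source is inlined with `eval: true` to keep the example in one file, and `fibInWorker` is a hypothetical helper name. A real application would more likely load the worker from its own file.

```javascript
import { Worker } from "node:worker_threads";

// Inline worker source. With `eval: true` the string runs as a CommonJS
// script, so it uses `require` rather than `import`.
const workerSource = `
  const { parentPort, workerData } = require("node:worker_threads");
  // Deliberately slow recursive Fibonacci as a stand-in for CPU-heavy work.
  const fib = (n) => (n < 2 ? n : fib(n - 1) + fib(n - 2));
  parentPort.postMessage(fib(workerData));
`;

// Run fib(n) on a separate thread and resolve with the result.
function fibInWorker(n) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true, workerData: n });
    worker.once("message", resolve);
    worker.once("error", reject);
    worker.once("exit", (code) => {
      if (code !== 0) reject(new Error(`worker exited with code ${code}`));
    });
  });
}

// The main thread stays free to handle other work while this is pending.
const result = await fibInWorker(25);
console.log(result); // 75025
```

Starting a worker has its own cost, so spawning one per request (as this sketch does per call) is only reasonable for occasional heavy tasks. For frequent work, a reusable pool of workers is the usual design.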

Keep the Middleware Stack Lean

Every middleware registered with app.use() runs on every request. Audit your stack:

// Print the middleware count to catch accidental bloat
console.log("Middleware count:", app.middleware.length);

Mount route-specific middleware on the router, not the app.
Parse the body only on routes that need it — avoid a global body parser.
Offload work the app does not need to handle itself (e.g., static file serving) to a CDN.
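One way to picture route-scoped middleware is a small wrapper that runs a middleware only for matching paths. The sketch below is an illustration under stated assumptions: `onlyFor` is a hypothetical helper rather than a Koa or router API, `fakeBodyParser` stands in for a real body parser, and the context objects are bare stubs with only a `path` field. With a router library you would normally attach the middleware to the route itself.

```javascript
// Hypothetical helper: run `middleware` only when the path starts with `prefix`.
function onlyFor(prefix, middleware) {
  return async (ctx, next) => {
    if (ctx.path.startsWith(prefix)) {
      return middleware(ctx, next);
    }
    return next(); // skip the wrapped middleware entirely
  };
}

// Stand-in for an expensive middleware such as a body parser.
let parseCalls = 0;
const fakeBodyParser = async (ctx, next) => {
  parseCalls += 1;
  ctx.parsed = true;
  await next();
};

const scoped = onlyFor("/api", fakeBodyParser);

// Simulate two requests with stub contexts (real Koa contexts have many more fields).
const apiCtx = { path: "/api/users" };
const pingCtx = { path: "/ping" };
await scoped(apiCtx, async () => {});
await scoped(pingCtx, async () => {});

console.log(parseCalls); // 1
```

In an app this would be registered as `app.use(onlyFor("/api", bodyParser()))`, so requests to cheap endpoints such as `/ping` never pay for body parsing.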

Benchmarking with Autocannon

Install autocannon as a dev dependency (or globally with npm i -g autocannon):

npm i -D autocannon

Run a 10-second benchmark with 10 concurrent connections:

npx autocannon -c 10 -d 10 http://localhost:3000/ping

Sample output:

┌─────────┬──────┬──────┬───────┬──────┬─────────┐
│ Stat    │ 2.5% │ 50%  │ 97.5% │ 99%  │ Avg     │
│ Latency │ 0 ms │ 0 ms │ 1 ms  │ 1 ms │ 0.21 ms │
├─────────┼──────┴──────┴───────┴──────┼─────────┤
│ Req/sec │ 28 000 – 42 000            │ 36 482  │
└─────────┴────────────────────────────┴─────────┘

Run the benchmark before and after each change to confirm the improvement is real. A 5–10% variance between runs is normal.
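The comparison itself is simple arithmetic, sketched here as two small helpers. The function names and the 10% noise threshold are assumptions taken from the variance figure above, and the "after" numbers are invented for illustration rather than measured.

```javascript
// Percentage change in requests per second from `before` to `after`.
function percentChange(before, after) {
  return ((after - before) / before) * 100;
}

// Treat a change as real only if it exceeds the assumed run-to-run noise.
function isSignificant(before, after, noisePercent = 10) {
  return Math.abs(percentChange(before, after)) > noisePercent;
}

// Baseline is the sample average shown above; the "after" values are hypothetical.
const baseline = 36482;

console.log(percentChange(baseline, 38000).toFixed(1)); // "4.2" (within noise)
console.log(isSignificant(baseline, 38000)); // false
console.log(isSignificant(baseline, 45000)); // true
```

For changes that land near the threshold, repeating each benchmark several times and comparing averages gives a more trustworthy answer than a single pair of runs.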

Up Next

Distribute load across all CPU cores using node:cluster and app.callback().