The Hard Part of WebSockets Isn't Opening Them, It's Scaling Them
WebSockets at Scale in Node
Why stateful socket connections break naive load balancing, how sticky sessions help, and how a Redis pub/sub adapter fans messages out across instances.
What you'll learn
- Explain why in-memory socket state defeats horizontal scaling
- Apply sticky sessions to keep a client pinned to one instance
- Use a Redis pub/sub adapter to fan messages out across all instances
WebSockets are easy to start and hard to scale — and the reason is the most important sentence in this whole section: a WebSocket connection lives in the memory of one specific server process. Everything difficult about real-time at scale flows from that single fact. This is the flagship Node lesson; we’ll build the problem up and then solve it.
The in-memory connection problem
A single Node process can comfortably hold tens of thousands of open sockets. When a client connects, you keep a reference to its socket in memory so you can push to it later:
import WebSocket, { WebSocketServer } from 'ws';

const wss = new WebSocketServer({ port: 8080 });
const clients = new Set(); // ⚠️ lives in THIS process's memory only

wss.on('connection', (socket) => {
  clients.add(socket);

  socket.on('message', (data, isBinary) => {
    // Broadcast to everyone... connected to THIS instance.
    for (const c of clients) {
      // Skip sockets that are not open; keep text frames as text.
      if (c.readyState === WebSocket.OPEN) c.send(data, { binary: isBinary });
    }
  });

  socket.on('close', () => clients.delete(socket));
});

This is correct on one server and silently broken on two. If Alice’s socket
lives on instance A and Bob’s lives on instance B, Alice’s message broadcasts to
the clients set on A, which doesn’t contain Bob. The room is split across
processes that can’t see each other’s connections.
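To make the split concrete, here is a minimal simulation with no network involved. The `createInstance` helper and the fake sockets are hypothetical stand-ins for two server processes, each holding its own `clients` set; this is a sketch of the failure mode, not of how `ws` works internally.

```javascript
// Hypothetical stand-in for one server process: it only knows its own sockets.
function createInstance(name) {
  const clients = new Set(); // per-process memory, like the `clients` Set above
  return {
    name,
    // Simulate a client connecting; the fake socket records what it receives.
    connect(user) {
      const socket = { user, inbox: [], send(data) { this.inbox.push(data); } };
      clients.add(socket);
      return socket;
    },
    // Same broadcast loop as the handler above: it iterates the LOCAL set only.
    broadcast(data) {
      for (const c of clients) c.send(data);
    },
  };
}

const instanceA = createInstance('A');
const instanceB = createInstance('B');

const alice = instanceA.connect('alice');
const bob = instanceB.connect('bob');

// Alice's message arrives at instance A, which broadcasts to its own set.
instanceA.broadcast('hi from alice');

console.log(alice.inbox); // [ 'hi from alice' ]
console.log(bob.inbox);   // [] because Bob is on instance B
```

Nothing throws and nothing is logged as an error, which is why this bug tends to surface only after a second instance is deployed.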
Why you can’t naively load-balance stateful sockets
Stateless HTTP scales horizontally because any server can handle any request — the load balancer sprays requests across the fleet and nobody cares which box answers. WebSockets break that assumption twice:
- The connection is pinned. Once the Upgrade handshake completes, that client is bound to that one process for the life of the socket. The LB can’t move an open connection to a less-busy box.
- The state is local. As we just saw, the set of who’s-connected-where lives in per-process memory. No single instance has the full picture.
So we have two distinct problems: (1) getting a client’s handshake to land on a server that can keep it, and (2) getting a message from a socket on A out to sockets on B and C. They have two different solutions.
Problem 1: sticky sessions
The handshake problem is solved at the load balancer with sticky sessions (aka
session affinity): the LB keys on the client (an IP hash or an affinity cookie)
so every request from that client, including the long-lived Upgrade, routes to
the same instance. This matters more than it first appears because Socket.IO by
default opens a session over HTTP long-polling and only then upgrades to
WebSocket. Each polling request is a separate HTTP request, and if one lands on
an instance that didn’t create the session, that instance rejects it and the
handshake fails.
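The hashing idea can be sketched in a few lines. This is an illustration of hash-based affinity, not the algorithm of any particular load balancer: the instance addresses, the `pickInstance` helper, and the choice of FNV-1a as the hash are all assumptions made for the example.

```javascript
// Hypothetical instance list; a real load balancer keeps this in its own config.
const instances = ['app-1:3000', 'app-2:3000', 'app-3:3000'];

// FNV-1a over UTF-16 code units: a small deterministic 32-bit hash, chosen for brevity.
function fnv1a(str) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < str.length; i++) {
    hash ^= str.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // keep the value an unsigned 32-bit int
  }
  return hash;
}

// The affinity key is the client IP or a cookie value.
// The same key always maps to the same instance while the pool is unchanged.
function pickInstance(affinityKey, pool = instances) {
  return pool[fnv1a(affinityKey) % pool.length];
}

const first = pickInstance('203.0.113.7');  // initial polling request
const second = pickInstance('203.0.113.7'); // follow-up request or Upgrade
console.log(first === second); // true
```

One caveat of the plain modulo used here: changing the pool size remaps most keys, so clients that reconnect after a scale-up may land on a different instance. That is acceptable for new handshakes, since open sockets are already pinned to their process.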
Problem 2: fan-out with a Redis pub/sub adapter
To get a message from a socket on instance A to sockets on B and C, you need a shared message bus the instances all subscribe to. Redis pub/sub is the canonical choice, and Socket.IO ships an adapter that wires it up for you.
The idea: when any instance emits to a room, it publishes that emit to Redis; every instance is subscribed, receives the message, and re-emits it to the matching local sockets. No instance needs to know where any socket physically lives — Redis is the meeting point.
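The publish-then-re-emit flow described above can be sketched without Redis at all. In this simulation `createBus` is a hypothetical in-memory stand-in for Redis pub/sub and `createServerInstance` stands in for one app process; the real adapter's internals may differ, so treat this as a model of the idea rather than of the library.

```javascript
// Hypothetical in-memory stand-in for Redis pub/sub: every subscriber gets every message.
function createBus() {
  const subscribers = new Set();
  return {
    subscribe(handler) { subscribers.add(handler); },
    publish(message) { for (const handler of subscribers) handler(message); },
  };
}

// One "instance" is one process with its own local room membership.
function createServerInstance(bus) {
  const rooms = new Map(); // room name -> Set of local sockets

  // On every bus message, re-emit to the matching LOCAL sockets only.
  bus.subscribe(({ room, event, payload }) => {
    for (const socket of rooms.get(room) ?? []) {
      socket.received.push({ event, payload });
    }
  });

  return {
    join(room, user) {
      const socket = { user, received: [] };
      if (!rooms.has(room)) rooms.set(room, new Set());
      rooms.get(room).add(socket);
      return socket;
    },
    // Emitting publishes to the bus instead of looping over local sockets directly.
    emitToRoom(room, event, payload) {
      bus.publish({ room, event, payload });
    },
  };
}

const bus = createBus();
const serverA = createServerInstance(bus);
const serverB = createServerInstance(bus);

const aliceSocket = serverA.join('lobby', 'alice');
const bobSocket = serverB.join('lobby', 'bob');
const carolSocket = serverB.join('other-room', 'carol');

serverA.emitToRoom('lobby', 'chat', 'hi from alice');

console.log(aliceSocket.received.length, bobSocket.received.length, carolSocket.received.length); // 1 1 0
```

Note that neither instance ever looks up where a socket lives. Each one filters the shared stream against its own room map, which is what lets the emitting code stay unaware of the fleet.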
In Socket.IO this is a few lines: attach the Redis adapter and keep writing code as if you had one server. The adapter intercepts every cross-instance emit.
import { createServer } from 'node:http';
import { Server } from 'socket.io';
import { createAdapter } from '@socket.io/redis-adapter';
import { createClient } from 'redis';

const httpServer = createServer();
const io = new Server(httpServer);

// Each app instance opens its own pub and sub connections to the same Redis server.
const pubClient = createClient({ url: 'redis://localhost:6379' });
const subClient = pubClient.duplicate();
await Promise.all([pubClient.connect(), subClient.connect()]);

// This is the whole fix: emits now fan out across ALL instances.
io.adapter(createAdapter(pubClient, subClient));

io.on('connection', (socket) => {
  socket.on('join', (room) => socket.join(room));

  socket.on('chat', (room, msg) => {
    // Reaches every socket in `room`, even ones on OTHER instances.
    io.to(room).emit('chat', msg);
  });
});

httpServer.listen(3000);

The payoff: your application code is identical to the single-server version,
still just io.to(room).emit(...), but it now correctly reaches Bob on instance B.
The adapter handles the publish/subscribe plumbing; you handle the chat logic.
Putting it together
A production WebSocket tier therefore needs three pieces working in concert:
| Piece | Problem it solves | Typical tool |
|---|---|---|
| Sticky sessions | Keep a client’s connection on one instance | LB affinity (cookie/IP hash) |
| Redis pub/sub adapter | Fan a message out to sockets on all instances | @socket.io/redis-adapter |
| Horizontal app tier | Hold more concurrent sockets | N stateless Node instances |
With those three, you scale real-time the same way you scale anything else: add
instances behind the LB. The sockets stay sticky, Redis glues the instances
together, and emit Just Works fleet-wide.
We solved live fan-out, but we explicitly skipped durability — what if the recipient is offline, or the work triggered by a message must not be lost? That’s the job of message queues, next.