Streaming Responses

Send Data As It's Ready — Don't Buffer

Stream large responses chunk by chunk, with constant memory and faster TTFB.

3 min read · Level 2/5 · #nodejs #http #streaming
What you'll learn
  • Stream a file response
  • Send Server-Sent Events
  • Implement chunked JSON

res.end(body) buffers the whole response in memory. For big or long-lived responses, stream chunks as they’re ready.
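
For contrast, here's the buffered version this replaces, as a minimal sketch (./big.mp4 is just a stand-in path):

import { readFile } from "node:fs/promises";
import { createServer } from "node:http";

createServer(async (req, res) => {
  const body = await readFile("./big.mp4"); // the whole file sits in RAM first
  res.writeHead(200, { "content-type": "video/mp4" });
  res.end(body); // first byte leaves only after the full read completes
}).listen(3000);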

Streaming a File

import { createReadStream } from "node:fs";
import { createServer } from "node:http";
import { pipeline } from "node:stream/promises";

createServer(async (req, res) => {
  res.setHeader("content-type", "video/mp4");
  // pipeline moves the file in small chunks and handles backpressure.
  await pipeline(createReadStream("./big.mp4"), res);
}).listen(3000);

Constant memory regardless of file size. Faster time-to-first-byte since the browser starts receiving immediately.
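
One caveat: pipeline rejects if the file is missing or the client disconnects mid-download, and an unhandled rejection will crash a modern Node process. A sketch of catching both (the error codes are standard Node values):

import { createReadStream } from "node:fs";
import { createServer } from "node:http";
import { pipeline } from "node:stream/promises";

createServer(async (req, res) => {
  res.setHeader("content-type", "video/mp4");
  try {
    await pipeline(createReadStream("./big.mp4"), res);
  } catch (err) {
    if (err.code === "ENOENT" && !res.headersSent) {
      res.writeHead(404).end("not found"); // file doesn't exist
    }
    // ERR_STREAM_PREMATURE_CLOSE just means the client hung up; nothing to do.
  }
}).listen(3000);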

Manual Chunks

createServer((req, res) => {
  res.writeHead(200, { "content-type": "text/plain" });

  let n = 0;
  const id = setInterval(() => {
    res.write(`tick ${n++}\n`); // each chunk reaches the client right away
    if (n >= 10) {
      clearInterval(id);
      res.end(); // closes the stream after the final chunk
    }
  }, 500);
}).listen(3000);

res.write(chunk) sends a chunk without ending the response; res.end() closes it. If write() returns false, the internal buffer is full: stop writing until the 'drain' event fires.
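
The same pattern gives you the chunked JSON from the goals above: write one JSON document per line (NDJSON), so the client can parse each record as it arrives. A sketch, where fetchRows is a made-up stand-in for a DB cursor or paginated API:

import { createServer } from "node:http";

// Hypothetical source of records, standing in for a DB cursor or API pager.
async function* fetchRows() {
  for (let i = 0; i < 3; i++) yield { id: i, value: i * 10 };
}

createServer(async (req, res) => {
  res.writeHead(200, { "content-type": "application/x-ndjson" });
  for await (const row of fetchRows()) {
    res.write(JSON.stringify(row) + "\n"); // one complete JSON document per line
  }
  res.end();
}).listen(3000);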

Server-Sent Events (SSE)

A persistent connection that pushes messages to the browser:

createServer((req, res) => {
  if (req.url !== "/events") return res.end();

  res.writeHead(200, {
    "content-type":  "text/event-stream",
    "cache-control": "no-cache",
    "connection":    "keep-alive",
  });

  const id = setInterval(() => {
    // each message is a "data:" line terminated by a blank line
    res.write(`data: ${JSON.stringify({ time: Date.now() })}\n\n`);
  }, 1000);

  req.on("close", () => clearInterval(id)); // stop pushing when the client disconnects
}).listen(3000);

In the browser:

const es = new EventSource("/events");
es.onmessage = (e) => console.log(JSON.parse(e.data));

SSE is simpler than WebSockets when you only need server → client push. Good for stock tickers, live logs, AI streaming.
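
EventSource also reconnects on its own if the connection drops. Tag each message with an id: line and the browser resends it as a Last-Event-ID header on reconnect, so you can resume instead of restarting. A sketch of the /events handler with that added:

import { createServer } from "node:http";

createServer((req, res) => {
  if (req.url !== "/events") return res.end();

  // On reconnect the browser tells us the last message it received.
  const lastSeen = Number(req.headers["last-event-id"] ?? -1);
  let next = lastSeen + 1;

  res.writeHead(200, { "content-type": "text/event-stream" });
  res.write("retry: 2000\n\n"); // ask the browser to wait 2s before reconnecting

  const id = setInterval(() => {
    res.write(`id: ${next}\n`); // stored by the browser as Last-Event-ID
    res.write(`data: ${JSON.stringify({ seq: next })}\n\n`);
    next++;
  }, 1000);

  req.on("close", () => clearInterval(id));
}).listen(3000);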

Streaming AI Responses

The pattern most chat UIs use now:

// callLLM is a placeholder for any client that yields text chunks
// as an async iterable (most LLM SDKs expose streaming this way).
async function streamChat(prompt, res) {
  res.writeHead(200, { "content-type": "text/event-stream" });
  for await (const chunk of callLLM(prompt)) {
    res.write(`data: ${JSON.stringify({ delta: chunk })}\n\n`);
  }
  res.write("data: [DONE]\n\n"); // conventional end-of-stream sentinel
  res.end();
}

The user sees tokens as they’re generated, not after the whole answer is ready.
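
EventSource only issues GET requests, so chat UIs that POST a prompt usually read the stream with fetch instead. A minimal client-side sketch; the /chat endpoint and payload shape are assumptions:

async function readChat(prompt) {
  const res = await fetch("/chat", {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify({ prompt }),
  });

  const reader = res.body.getReader();
  const decoder = new TextDecoder();
  let buf = "";

  while (true) {
    const { value, done } = await reader.read();
    if (done) break;
    buf += decoder.decode(value, { stream: true });

    const lines = buf.split("\n");
    buf = lines.pop(); // a chunk can end mid-line; keep the remainder
    for (const line of lines) {
      if (!line.startsWith("data: ")) continue;
      const data = line.slice("data: ".length);
      if (data === "[DONE]") return;
      console.log(JSON.parse(data).delta); // append the token to the UI here
    }
  }
}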
