Decouple Producers From Consumers and Survive the Spikes
Message Queues and Brokers
Kafka vs RabbitMQ vs SQS, at-least-once vs at-most-once delivery, ordering, consumer groups and partitions, and when to introduce a queue at all.
What you'll learn
- Distinguish a log (Kafka) from a broker (RabbitMQ) and a managed queue (SQS)
- Reason about delivery guarantees, ordering, and consumer groups
- Decide when adding a queue actually pays off
A message queue sits between a producer and a consumer so the two don’t have to be online at the same time, run at the same speed, or scale together. The producer drops a message and moves on; a consumer picks it up whenever it’s ready. That single indirection buys you decoupling, buffering against spikes, and resilience: if the consumer is down, messages wait instead of being lost.
When to introduce a queue
Don’t add a queue reflexively — it’s a new piece of infrastructure to operate. Reach for one when you see:
- Slow work on the request path — sending email, transcoding video, charging a card. Push it off the synchronous path so the user isn’t waiting.
- Spiky load — a flash sale produces 50× normal writes for ten minutes. A queue absorbs the spike and lets consumers drain at a steady rate (load leveling), instead of crushing the database.
- Fan-out to many consumers — one event (“order placed”) needs to trigger billing, shipping, and analytics independently.
- Crossing a reliability boundary — you want a retryable buffer between two services that fail independently.
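To make the load-leveling case concrete, here is a minimal simulation, not a real queue. The function name `simulateLoadLeveling` and the numbers are hypothetical: arrivals spike to 50 per tick while the consumer is assumed to drain at most 20 per tick.

```javascript
// Hypothetical simulation: work arrives in bursts, a worker drains at a fixed rate.
function simulateLoadLeveling(arrivalsPerTick, drainPerTick) {
  let backlog = 0;
  const backlogOverTime = [];
  const processedOverTime = [];
  for (const arrivals of arrivalsPerTick) {
    backlog += arrivals; // the producer enqueues and moves on
    const processed = Math.min(backlog, drainPerTick); // the consumer takes at most its capacity
    backlog -= processed;
    processedOverTime.push(processed);
    backlogOverTime.push(backlog);
  }
  return { backlogOverTime, processedOverTime };
}

// Normal load is 10 per tick; ticks 2 and 3 spike to 50.
const result = simulateLoadLeveling([10, 10, 50, 50, 10, 10, 10, 10], 20);
console.log(result.processedOverTime); // [10, 10, 20, 20, 20, 20, 20, 20]
console.log(result.backlogOverTime);   // [0, 0, 30, 60, 50, 40, 30, 20]
```

The downstream system never sees more than 20 items per tick. The spike shows up as a backlog that drains over the following ticks, which is the trade: latency for some messages in exchange for a steady load.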
Two mental models: log vs broker
The biggest conceptual split in this space is log versus broker.
A broker (RabbitMQ, SQS) treats messages as transient work items. A message is delivered to a consumer, acknowledged, and deleted. The broker actively routes and tracks each message’s state. Think of it as a smart to-do list that hands out tasks and crosses them off.
A log (Kafka) is an append-only, ordered, durable sequence of records, split into partitions. Consumers don’t “take” messages — they read forward at their own offset, and the records stay for a retention window (hours, days, or forever). Multiple independent consumers can read the same log at different positions, and you can replay from any offset. Think of it as a durable event ledger rather than a task list.
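The difference between the two models can be sketched with two tiny in-memory classes. These are illustrative toys, not how RabbitMQ, SQS, or Kafka are implemented, and the names `BrokerQueue` and `Log` are hypothetical.

```javascript
// Broker-style queue: a message is handed to a consumer, then deleted once acked.
class BrokerQueue {
  constructor() {
    this.ready = [];           // waiting to be delivered
    this.inFlight = new Map(); // delivered but not yet acked
    this.nextId = 0;
  }
  publish(body) {
    this.ready.push({ id: this.nextId++, body });
  }
  receive() {
    const msg = this.ready.shift();
    if (msg) this.inFlight.set(msg.id, msg);
    return msg ?? null;
  }
  ack(id) {
    this.inFlight.delete(id); // after the ack the message no longer exists
  }
}

// Log: records are appended and kept; each reader tracks its own offset.
class Log {
  constructor() {
    this.records = [];
  }
  append(value) {
    this.records.push(value);
    return this.records.length - 1; // the record's offset
  }
  read(offset, maxRecords = 10) {
    return this.records.slice(offset, offset + maxRecords);
  }
}

const queue = new BrokerQueue();
queue.publish('send-email');
const task = queue.receive();
queue.ack(task.id);
const afterAck = queue.receive(); // null: the acked message is gone

const log = new Log();
['order-placed', 'order-paid', 'order-shipped'].forEach((e) => log.append(e));
const billingView = log.read(0);   // one reader starts from the beginning
const analyticsView = log.read(2); // another reader is further along
const replayed = log.read(0);      // reading again returns the same records
```

Note where the state lives: the broker tracks each message's status, while the log only stores records and leaves the position to each reader.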
| | Kafka (log) | RabbitMQ (broker) | SQS (managed queue) |
|---|---|---|---|
| Model | Append-only log | Routing broker | Managed broker |
| After consumption | Retained (replayable) | Deleted on ack | Deleted on delete |
| Ordering | Per partition | Per queue (best effort) | FIFO queues only |
| Throughput | Very high | High | High (elastic) |
| Replay | Yes (by offset) | No | No |
| Routing logic | Minimal (topic/partition) | Rich (exchanges) | Minimal |
| Ops burden | High (self-run) | Medium | None (AWS-managed) |
A quick way to choose: Kafka when you want a durable, replayable stream that many systems read (event sourcing, analytics, audit). RabbitMQ when you want rich routing and per-message work semantics. SQS when you want a dead-simple managed queue and don’t want to run a broker at all.
Delivery guarantees
No distributed queue can promise true exactly-once delivery end to end (we’ll prove why in the idempotency lesson). In practice you choose between two guarantees:
- At-most-once — deliver and forget; if the consumer crashes mid-process, the message is lost. Acceptable only when losing a message is fine (e.g., a best-effort metric).
- At-least-once — the message is redelivered until the consumer acks success. If the consumer crashes after doing the work but before acking, you get a duplicate. This is the common, safe default.
The practical consequence: design every consumer to be idempotent, because at-least-once means duplicates will happen. The mechanics for that are the next big topic.
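The duplicate described above is easy to reproduce in a small simulation. Everything here is hypothetical scaffolding: `deliverAtLeastOnce` stands in for a broker's redelivery loop, and the handler is rigged to crash once after doing its work. The in-memory `Set` is only a placeholder for a durable deduplication store.

```javascript
// Redeliver until the handler returns without throwing (a stand-in for an ack).
function deliverAtLeastOnce(message, handler, maxAttempts = 5) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      handler(message, attempt);
      return attempt; // acked on this attempt
    } catch (err) {
      // No ack was recorded, so loop around and redeliver.
    }
  }
  throw new Error(`gave up after ${maxAttempts} attempts`);
}

// Wrap some work so that the first delivery crashes after the work is done.
function makeCrashOnceHandler(doWork) {
  return (message, attempt) => {
    doWork(message);
    if (attempt === 1) throw new Error('crashed after the work, before the ack');
  };
}

// Naive consumer: every delivery repeats the side effect.
const naiveCharges = [];
deliverAtLeastOnce({ id: 'o1' }, makeCrashOnceHandler((m) => naiveCharges.push(m.id)));

// Idempotent consumer: remember processed ids and skip repeats.
const seen = new Set();
const idempotentCharges = [];
deliverAtLeastOnce({ id: 'o1' }, makeCrashOnceHandler((m) => {
  if (seen.has(m.id)) return;
  seen.add(m.id);
  idempotentCharges.push(m.id);
}));

console.log(naiveCharges.length);      // 2: the customer was charged twice
console.log(idempotentCharges.length); // 1: the redelivery was recognized and skipped
```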
Ordering, partitions, and consumer groups
Ordering is only guaranteed within a single partition (Kafka) or a single
FIFO queue (SQS). Across partitions, all bets are off. So if order matters — say,
all events for one user — you route them to the same partition by a key
(e.g., hash of userId). Same key, same partition, ordered.
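A minimal sketch of key-based routing follows. The `partitionFor` function and its simple string hash are assumptions for illustration; real clients use their own hash functions, so do not expect this one to match any broker's partition numbers. The point is only that a deterministic hash sends the same key to the same partition.

```javascript
// Hypothetical partitioner: hash the key, then take it modulo the partition count.
function partitionFor(key, numPartitions) {
  let hash = 0;
  for (let i = 0; i < key.length; i++) {
    hash = (hash * 31 + key.charCodeAt(i)) >>> 0; // keep as an unsigned 32-bit integer
  }
  return hash % numPartitions;
}

const events = [
  { userId: 'user-42', type: 'cart-created' },
  { userId: 'user-7', type: 'cart-created' },
  { userId: 'user-42', type: 'order-placed' },
  { userId: 'user-42', type: 'order-paid' },
];

// Append each event to the partition chosen by its key.
const partitions = Array.from({ length: 3 }, () => []);
for (const event of events) {
  partitions[partitionFor(event.userId, 3)].push(event);
}

// All of user-42's events sit in one partition, in the order they were produced.
const user42Partition = partitions[partitionFor('user-42', 3)];
const user42Types = user42Partition
  .filter((e) => e.userId === 'user-42')
  .map((e) => e.type);
console.log(user42Types); // ['cart-created', 'order-placed', 'order-paid']
```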
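Consumer groups are how you scale reads while preserving that order. Kafka assigns each partition to exactly one consumer in a group, so you parallelize across partitions without two consumers fighting over the same one. Want more parallelism? Add partitions, since the partition count caps the number of active consumers in a group; any extra consumers sit idle. Be aware that changing the partition count changes which partition a key hashes to, so ordering for existing keys is not preserved across the change. Want a second independent reader of everything (analytics alongside billing)? Use a different consumer group; it gets its own offsets and reads the full log.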
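The one-partition-one-owner rule can be sketched with a round-robin assignment. This is a simplified stand-in, not one of Kafka's actual assignment strategies, and `assignPartitions` and the consumer names are hypothetical.

```javascript
// Give each partition exactly one owner within a group, round-robin.
function assignPartitions(partitions, consumers) {
  const assignment = new Map(consumers.map((c) => [c, []]));
  partitions.forEach((partition, i) => {
    const owner = consumers[i % consumers.length];
    assignment.get(owner).push(partition);
  });
  return assignment;
}

const partitions = [0, 1, 2, 3, 4, 5];

// Group "fulfillment": three instances split the six partitions.
const fulfillment = assignPartitions(partitions, ['f-1', 'f-2', 'f-3']);
console.log(fulfillment.get('f-1')); // [0, 3]

// Group "analytics": a separate group, so its single instance reads every partition.
const analytics = assignPartitions(partitions, ['a-1']);
console.log(analytics.get('a-1')); // [0, 1, 2, 3, 4, 5]

// More consumers than partitions: the extra instance gets nothing to do.
const overProvisioned = assignPartitions([0, 1, 2], ['w-1', 'w-2', 'w-3', 'w-4']);
console.log(overProvisioned.get('w-4')); // []
```

The last case shows why the partition count sets the ceiling on parallelism within a group.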
Backlog and replay
Because a log retains messages, a slow or crashed consumer simply builds lag (its offset falls behind the head) and catches up later. Nothing is lost as long as it catches up before the retention window expires. And because you can rewind the offset, you can replay history: reprocess a day of events after fixing a bug, or bootstrap a brand-new consumer from the beginning. A broker queue, by contrast, has no replay: once a message is acked and deleted, it’s gone.
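Lag and replay both reduce to simple offset arithmetic. The sketch below assumes a committed offset means "the next offset to read" and uses plain arrays in place of a real log; `lagByPartition` and `replay` are hypothetical helper names.

```javascript
// Lag per partition: how far the group's committed offset trails the end of the log.
function lagByPartition(endOffsets, committedOffsets) {
  const lag = {};
  for (const [partition, end] of Object.entries(endOffsets)) {
    lag[partition] = end - (committedOffsets[partition] ?? 0);
  }
  return lag;
}

// Replay: re-read retained records from an earlier offset with a handler.
function replay(records, fromOffset, handler) {
  for (let offset = fromOffset; offset < records.length; offset++) {
    handler(records[offset], offset);
  }
  return records.length; // the new committed offset
}

// Partition 0 is caught up; partition 1 is 55 records behind.
const lag = lagByPartition({ 0: 120, 1: 95 }, { 0: 120, 1: 40 });
console.log(lag);

// Rebuild a total from the start of the log, e.g. after fixing a bug in the handler.
const records = [{ amount: 5 }, { amount: 7 }, { amount: 3 }];
let total = 0;
const committed = replay(records, 0, (record) => {
  total += record.amount;
});
console.log(total, committed); // 15 3
```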
The JavaScript angle
From Node, you produce and consume with client libraries: kafkajs for Kafka,
amqplib for RabbitMQ, the AWS SDK for SQS. The shape is always the same — a
producer publishes, a long-running worker consumes and acks.
// Run as an ES module (top-level await); fulfill() is your own handler.
import { Kafka } from 'kafkajs';

const kafka = new Kafka({ clientId: 'orders', brokers: ['localhost:9092'] });

// Producer: key by userId so a user's events keep order in one partition.
const producer = kafka.producer();
await producer.connect();
await producer.send({
  topic: 'orders',
  messages: [{ key: 'user-42', value: JSON.stringify({ orderId: 'o1' }) }],
});

// Consumer: part of a group, so partitions are split across instances.
const consumer = kafka.consumer({ groupId: 'fulfillment' });
await consumer.connect();
await consumer.subscribe({ topic: 'orders', fromBeginning: false });
await consumer.run({
  eachMessage: async ({ message }) => {
    const order = JSON.parse(message.value.toString());
    await fulfill(order); // make this idempotent: redelivery can happen
  },
});

The queue gives one producer many independent consumers, but it’s still a work-distribution tool: each message goes to one consumer in a group. When you instead want every subscriber to get every message, you want publish/subscribe, which is next.