API Gateway

One Front Door for Many Services

What an API gateway actually does, how it differs from a reverse proxy and a load balancer, and the Backend-for-Frontend pattern that grows out of it.

What you'll learn
  • List the cross-cutting concerns a gateway centralizes
  • Distinguish gateway, reverse proxy, and load balancer
  • Explain when a Backend-for-Frontend is worth the extra hop

Once you have more than one service, a question appears: where do all the cross-cutting concerns live? Every service needs auth, rate limiting, TLS termination, logging, and request routing — and you do not want to copy-paste that into all twenty of them. The answer is an API gateway: a single entry point that sits in front of your services and handles the work that’s the same for everyone.

The client talks to one host. Behind it, the gateway routes each request to the right service, applies the policies, and forwards it on.

[Figure: API gateway architecture diagram]

What the gateway does

A gateway is where you centralize the concerns that would otherwise be duplicated. The usual set:

  • Routing — map an incoming path or host to the right upstream service.
  • Authentication & authorization — verify the token once, at the edge, so downstream services trust an already-authenticated request.
  • Rate limiting & quotas — the limiter from the last lesson, enforced before traffic ever reaches a service.
  • TLS termination — decrypt HTTPS here so internal hops can be cheaper plaintext (inside a trusted network) or re-encrypted as needed.
  • Request aggregation — fan one client request out to several services and stitch the responses into one payload, saving the client round trips.
  • Observability — a single, consistent place to log, trace, and meter every request entering the system.

By pulling these out of the services, each service shrinks to just its business logic — which is the whole point of the pattern.
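To make the list above concrete, here is a minimal, framework-free sketch of the decision a gateway makes for each request: authenticate, rate limit, then route. Everything named here is an assumption for illustration: the `routes` table, the `validTokens` map (a stand-in for real token verification), and the fixed-window limiter (a simple stand-in for whichever limiter you actually use). A production gateway would proxy the request instead of just reporting the target.

```javascript
// Hypothetical route table: path prefix -> upstream service base URL.
const routes = [
  { prefix: '/users', upstream: 'http://users.internal' },
  { prefix: '/orders', upstream: 'http://orders.internal' },
];

// Stand-in for real token verification (for example, checking a JWT signature).
const validTokens = new Map([['token-abc', { userId: 'u1' }]]);

// Fixed-window rate limiter: at most `limit` requests per client per window.
function createLimiter(limit, windowMs) {
  const windows = new Map(); // clientId -> { start, count }
  return (clientId, now = Date.now()) => {
    const w = windows.get(clientId);
    if (!w || now - w.start >= windowMs) {
      windows.set(clientId, { start: now, count: 1 });
      return true;
    }
    w.count += 1;
    return w.count <= limit;
  };
}

const allow = createLimiter(2, 60_000);

// Decide what to do with a request: reject it at the edge, or forward it upstream.
function handle({ path, token }, now = Date.now()) {
  // 1. Authentication: verify once, here, so services can trust the caller.
  const auth = validTokens.get(token);
  if (!auth) return { status: 401 };

  // 2. Rate limiting: enforced before traffic reaches any service.
  if (!allow(auth.userId, now)) return { status: 429 };

  // 3. Routing: map the path to the right upstream.
  const route = routes.find(
    (r) => path === r.prefix || path.startsWith(r.prefix + '/'),
  );
  if (!route) return { status: 404 };

  // A real gateway would proxy the request here; this sketch only reports the target.
  return { status: 200, forwardTo: route.upstream + path, userId: auth.userId };
}
```

The ordering is a design choice: rejecting unauthenticated or over-limit requests first means the cheapest checks protect the upstream services from ever seeing that traffic.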

Gateway vs reverse proxy vs load balancer

These three overlap and get conflated constantly. The honest distinction is about scope and intelligence, not hard boundaries:

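| Component     | Primary job                                      | Knows about                                     |
|---------------|--------------------------------------------------|-------------------------------------------------|
| Load balancer | Spread traffic across identical instances        | How many instances, and whether they are healthy |
| Reverse proxy | Front one or more backends, terminate TLS, cache | Where backends are                              |
| API gateway   | Route + apply API policy per route               | What the API is: auth, limits, aggregation      |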

A load balancer (L4 or L7) answers “which of these N identical app servers should this request go to?” A reverse proxy (nginx, Caddy) sits in front of backends to terminate TLS, cache, and forward; it usually covers L7 load balancing as well, so its job largely contains a load balancer’s. An API gateway is a reverse proxy that’s API-aware: it understands routes, auth schemes, and per-endpoint policy. In practice you often have all three stacked, with a load balancer in front of a gateway in front of your services, and tools like nginx can play more than one role.
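One way to see the difference in what each component knows is to compare the data each one needs. The sketch below is illustrative only: the instance list, the policy fields (`auth`, `rateLimitPerMin`), and both function names are assumptions, not the configuration format of any real product.

```javascript
// Load balancer: knows instances and their health, nothing about the API.
function createRoundRobin(instances) {
  let next = 0;
  return () => {
    const healthy = instances.filter((i) => i.healthy);
    if (healthy.length === 0) return null; // nothing to send traffic to
    const chosen = healthy[next % healthy.length];
    next += 1;
    return chosen.host;
  };
}

// API gateway: knows what each route is and which policy applies to it.
const policies = [
  { method: 'GET', prefix: '/orders', auth: 'user', rateLimitPerMin: 60 },
  { method: 'POST', prefix: '/orders', auth: 'user', rateLimitPerMin: 10 },
  { method: 'GET', prefix: '/health', auth: 'none', rateLimitPerMin: 600 },
];

function policyFor(method, path) {
  return (
    policies.find((p) => p.method === method && path.startsWith(p.prefix)) ??
    null
  );
}
```

The balancer's input is a list of interchangeable hosts; the gateway's input is a description of the API itself. That is why the same request can pass through both without either duplicating the other's work.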

The Backend-for-Frontend pattern

A single gateway serving every client eventually strains, because a web app, a mobile app, and a public partner API want very different things from the same backend. Mobile wants tiny, battery-friendly payloads; the web app wants rich ones; the partner API wants a stable, versioned contract. Cramming all three into one gateway turns it into a tangle of if (client === 'mobile') branches.

The Backend-for-Frontend (BFF) pattern gives each client type its own gateway, tailored to it:

[Figure: API gateway architecture diagram]

Each BFF aggregates and shapes the downstream services for exactly one frontend. The mobile BFF returns a trimmed payload in one call; the web BFF returns the full object. Each frontend team owns its BFF, so they can move fast without coordinating on a shared, do-everything gateway.
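As a small illustration of per-client shaping, the sketch below takes one downstream order object and produces the payload each BFF might return. The order's fields and the two function names are hypothetical; real shapes depend on what each screen renders.

```javascript
// A hypothetical order as returned by the downstream orders service.
const order = {
  id: 'o-1',
  total: 42.5,
  currency: 'USD',
  items: [
    { sku: 'A1', name: 'Keyboard', qty: 1 },
    { sku: 'B2', name: 'Cable', qty: 2 },
  ],
  shippingAddress: { city: 'Lisbon', country: 'PT' },
};

// Mobile BFF: small and flat, only what the list screen shows.
function toMobileOrder(o) {
  return { id: o.id, total: o.total, itemCount: o.items.length };
}

// Web BFF: the full object, since the web app renders the details.
function toWebOrder(o) {
  return { ...o };
}
```

Because each function lives in its own BFF, the mobile team can drop or rename a field without touching the payload the web app depends on.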

The JavaScript angle

The BFF is the pattern where Node shines — it’s I/O-bound aggregation, which is exactly what the event loop is good at. A BFF endpoint fans out to several services in parallel and reshapes the result:

A BFF endpoint aggregating services (script.js):
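// Mobile home screen needs profile + recent orders + unread count.
// One client call → three parallel internal calls → one trimmed payload.
import express from 'express';

const app = express();

// Internal service base URLs; reading them from the environment is an assumption.
const { USERS, ORDERS, NOTIFS } = process.env;

// Fetch JSON, rejecting on non-2xx so an error body is never parsed as data.
const getJson = async (url) => {
  const r = await fetch(url);
  if (!r.ok) throw new Error(`${url} responded with ${r.status}`);
  return r.json();
};

app.get('/mobile/home', async (req, res) => {
  const { userId } = req.auth; // already verified at the gateway

  try {
    const [profile, orders, unread] = await Promise.all([
      getJson(`${USERS}/users/${userId}`),
      getJson(`${ORDERS}/orders?user=${userId}&limit=3`),
      getJson(`${NOTIFS}/unread-count/${userId}`),
    ]);

    // Shape it for mobile: small, flat, exactly what this screen renders.
    res.json({
      name: profile.displayName,
      recentOrders: orders.map((o) => ({ id: o.id, total: o.total })),
      badge: unread.count,
    });
  } catch {
    res.status(502).json({ error: 'upstream request failed' });
  }
});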

Promise.all turns three serial round trips into one parallel wait, so the endpoint takes roughly as long as its slowest internal call rather than the sum of all three. This is the latency win from the foundations latency lesson, applied. The client makes one request over the slow mobile network; the fan-out happens inside the datacenter where round trips are cheap.
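The effect is easy to measure with simulated calls. In this sketch, `fakeCall` is a hypothetical stand-in for an internal request, and the delays (40, 60, and 30 ms) are made-up numbers chosen only to make the difference visible.

```javascript
// Simulated internal call: resolves with `value` after `ms` milliseconds.
const fakeCall = (value, ms) =>
  new Promise((resolve) => setTimeout(() => resolve(value), ms));

// Serial: each call waits for the previous one to finish.
async function serial() {
  const start = Date.now();
  const profile = await fakeCall('profile', 40);
  const orders = await fakeCall('orders', 60);
  const unread = await fakeCall('unread', 30);
  return { results: [profile, orders, unread], elapsed: Date.now() - start };
}

// Parallel: all three calls start immediately; we wait once for all of them.
async function parallel() {
  const start = Date.now();
  const results = await Promise.all([
    fakeCall('profile', 40),
    fakeCall('orders', 60),
    fakeCall('unread', 30),
  ]);
  return { results, elapsed: Date.now() - start };
}
```

Awaiting `serial()` should report an elapsed time near the sum of the delays (about 130 ms), while `parallel()` should land near the slowest single call (about 60 ms). Both return the same results in the same order, because `Promise.all` preserves the order of its input array.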

A gateway is the place to apply resilience — retries, timeouts, circuit breakers — to every downstream call. Those patterns are next.