Content Delivery Networks

Push Bytes to the Edge, Closer Than Your Origin Ever Could Be

Edge caching, origin shielding, push vs pull CDNs, and the Cache-Control/ETag headers that drive them — plus setting cache headers in Express and busting caches with hashed filenames.

8 min read · Level 3/5 · #system-design #cdn #caching
What you'll learn
  • Explain edge caching, origin shielding, and push vs pull CDNs
  • Control caching with Cache-Control and ETag headers
  • Decide what belongs on a CDN and bust caches with hashed filenames

Recall the latency table: a cross-region round trip is ~150ms, roughly a million times slower than a RAM access. If your origin servers live in Virginia and your user is in Singapore, every request pays that ocean crossing. A Content Delivery Network (CDN) fixes this by putting copies of your content on servers physically near your users, at hundreds of edge locations worldwide, so the bytes travel across a city instead of across an ocean.

A CDN is a globally distributed cache. The user’s request is answered by the nearest edge node, and only the rare cache miss travels back to your origin.

[Figure: Content Delivery Networks architecture diagram]

Edge caching and origin shielding

The win is twofold. Latency drops because content is served from nearby. And origin load drops because the edge absorbs the overwhelming majority of requests — your origin only sees cache misses.

Origin shielding takes this further. Instead of every one of hundreds of edge locations independently calling your origin on a miss, the CDN designates an intermediate shield layer. Edges fetch from the shield; only the shield fetches from your origin. So a globally viral asset hits your origin roughly once per cache lifetime instead of once per edge. Under a traffic spike, the shield is what stops a thundering herd of edge misses from flattening your servers.
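To make the shield's effect concrete, here is a toy in-memory simulation. It is a sketch under simplifying assumptions, not how any particular CDN is implemented: the names `makeOrigin`, `makeCache`, and `simulate` are hypothetical, caches never expire, and requests arrive one at a time.

```javascript
// Toy model: every cache is a Map in front of an "upstream" with a get() method.
function makeOrigin() {
  const origin = { hits: 0 };
  origin.get = (key) => {
    origin.hits += 1;              // count every request that reaches the origin
    return `body-of-${key}`;
  };
  return origin;
}

function makeCache(upstream) {
  const store = new Map();
  return {
    get(key) {
      if (!store.has(key)) {
        store.set(key, upstream.get(key)); // miss: fetch from the next tier
      }
      return store.get(key);               // hit: served locally
    },
  };
}

// Count origin hits when `edgeCount` cold edges all request the same asset.
function simulate(edgeCount, useShield) {
  const origin = makeOrigin();
  const upstream = useShield ? makeCache(origin) : origin;
  const edges = Array.from({ length: edgeCount }, () => makeCache(upstream));
  for (const edge of edges) {
    edge.get('/assets/viral.jpg'); // cold miss at this edge
    edge.get('/assets/viral.jpg'); // warm hit at this edge
  }
  return origin.hits;
}

console.log(simulate(200, false)); // 200: every edge misses straight to the origin
console.log(simulate(200, true));  // 1: only the shield's first miss reaches the origin
```

A real shield also has to coalesce concurrent misses for the same key and honor TTLs; this model skips both to keep the counting visible.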

Push vs pull CDNs

There are two ways content gets onto the edge:

| | Pull CDN | Push CDN |
| --- | --- | --- |
| How content arrives | Lazily, on first request (miss → fetch from origin) | You upload/publish it to the CDN ahead of time |
| First request | Slow (one miss pays the origin trip) | Fast (already there) |
| Best for | Large, changing catalogs; you cache what’s actually requested | A known, finite set of assets you want pre-warmed |
| Ops burden | Low: set headers, point DNS, done | Higher: you manage publishing/invalidation |

Pull is the common default: you keep content on your origin, set cache headers, and the CDN populates itself on demand. Push suits cases where you want assets in place before the first request — a big launch, or content too large to want a slow first miss. Many real setups are pull-based with selective pre-warming of critical assets.

The headers that drive caching: Cache-Control and ETag

The CDN and the browser decide what to cache, and for how long, based on the HTTP response headers your origin sends. Two matter most.

Cache-Control sets the policy:

  • public, max-age=31536000, immutable: cache for a year, in browsers and shared caches alike, and skip revalidation while the copy is fresh (use this for hashed, never-changing assets).
  • public, max-age=60: cache for a minute; good for cacheable but occasionally-changing API responses.
  • no-store: never cache (personalized or sensitive responses).
  • s-maxage=300: a TTL for shared caches such as CDN edges that overrides max-age there and is ignored by browsers, letting the edge cache longer than the browser.
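The way these directives combine can be sketched as a small function that reads a Cache-Control value and reports what a browser or a shared cache would do with it. This is a simplified illustration: `parseCacheControl` and `cachePlan` are hypothetical names, and the sketch ignores heuristic freshness, the Age header, and less common directives.

```javascript
// Parse "public, max-age=60" into { public: true, 'max-age': '60' }.
function parseCacheControl(header) {
  const directives = {};
  for (const part of header.split(',')) {
    const [rawName, rawValue] = part.trim().split('=');
    if (!rawName) continue;
    directives[rawName.toLowerCase()] =
      rawValue === undefined ? true : rawValue.replace(/"/g, '');
  }
  return directives;
}

// cacheKind is 'browser' (private cache) or 'shared' (CDN edge, proxy).
function cachePlan(header, cacheKind) {
  const d = parseCacheControl(header);
  const shared = cacheKind === 'shared';

  // no-store forbids storing anywhere; private forbids shared caches only.
  const storable = !d['no-store'] && !(shared && d['private']);
  if (!storable) return { storable: false, ttlSeconds: 0 };

  // no-cache allows storing but demands revalidation before every reuse.
  if (d['no-cache']) return { storable: true, ttlSeconds: 0 };

  // s-maxage wins over max-age, but only for shared caches.
  const ttl = shared && d['s-maxage'] !== undefined ? d['s-maxage'] : d['max-age'];
  return { storable: true, ttlSeconds: ttl === undefined ? 0 : Number(ttl) };
}

console.log(cachePlan('public, max-age=60, s-maxage=300', 'browser')); // { storable: true, ttlSeconds: 60 }
console.log(cachePlan('public, max-age=60, s-maxage=300', 'shared'));  // { storable: true, ttlSeconds: 300 }
console.log(cachePlan('private, no-store', 'shared'));                 // { storable: false, ttlSeconds: 0 }
```

Treating a missing max-age as a TTL of zero is a deliberate simplification; real caches may apply a heuristic lifetime instead.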

ETag is a content fingerprint for revalidation. When a cached copy expires, the client/CDN sends If-None-Match: "<etag>"; if the content hasn’t changed, the origin replies 304 Not Modified with an empty body — saving the bandwidth of re-sending an identical payload.
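The revalidation exchange can be sketched as a pure function that decides between 200 and 304. This is a minimal illustration rather than a full implementation of the HTTP rules: `opaqueTag` and `revalidate` are hypothetical names, and splitting on commas assumes the tags themselves contain no commas.

```javascript
// If-None-Match uses weak comparison, so W/"abc" and "abc" count as equal.
function opaqueTag(etag) {
  return etag.trim().replace(/^W\//, '');
}

// Decide how an origin might answer a conditional GET.
// ifNoneMatch: the request header value (or undefined on a first fetch).
// currentEtag: the ETag of the representation the origin holds right now.
function revalidate(ifNoneMatch, currentEtag, body) {
  const candidates = ifNoneMatch ? ifNoneMatch.split(',').map(opaqueTag) : [];
  const matches =
    candidates.includes('*') || candidates.includes(opaqueTag(currentEtag));

  return matches
    ? { status: 304, headers: { ETag: currentEtag }, body: '' }   // unchanged: no payload
    : { status: 200, headers: { ETag: currentEtag }, body };      // changed or first fetch
}

console.log(revalidate('"v1"', '"v1"', 'hello').status);   // 304
console.log(revalidate('"v0"', '"v1"', 'hello').status);   // 200
console.log(revalidate(undefined, '"v1"', 'hello').status); // 200
```

In Express, `res.json` and `res.send` perform this check for you, so a hand-written version like this is mainly useful for understanding what happens on the wire.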

Setting cache headers in Express (script.js):
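import express from 'express';
const app = express();

// Hypothetical stand-ins so this snippet runs on its own; replace with real data access.
const getPopularPosts = async () => [{ id: 1, title: 'Example post' }];
const getCurrentUser = (req) => ({ id: 'demo-user' });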

// Hashed static assets: cache hard, forever. The hash in the filename
// guarantees a new URL when content changes, so 'immutable' is safe.
app.use('/assets', express.static('public/assets', {
  maxAge: '1y',
  immutable: true,                       // Cache-Control: public, max-age=31536000, immutable
}));

// A cacheable API response: short TTL, longer at the CDN, with an ETag
// so revalidation is cheap (304s instead of full bodies).
app.get('/api/popular', async (req, res, next) => {
  try {
    const data = await getPopularPosts();
    res.set('Cache-Control', 'public, max-age=60, s-maxage=300');
    res.json(data);                      // Express adds a weak ETag automatically
  } catch (err) {
    next(err);                           // Express 4 does not catch async rejections itself
  }
});

// Per-user / sensitive data: never let an edge or browser cache it.
app.get('/api/me', (req, res) => {
  res.set('Cache-Control', 'private, no-store');
  res.json(getCurrentUser(req));
});

app.listen(3000);

What to put on a CDN — and what not to

| Put it on the CDN | Keep it off the CDN |
| --- | --- |
| Static assets: JS, CSS, fonts, images | Per-user, personalized responses |
| Video/audio segments, large downloads | Anything with Set-Cookie / auth state |
| Public, cacheable API responses (with short TTL) | Write requests (POST/PUT/DELETE) |
| Anything identical for all users | Real-time / rapidly-changing data |

The rule of thumb: cache what’s identical across users and changes slowly. Static assets are the obvious, huge win. Public API responses can go on a CDN too with a short s-maxage, but the moment a response varies per user or carries auth, push it to no-store — a leaked cached response is a security incident.

The JavaScript angle: cache busting with hashed filenames

There’s a tension: you want assets cached for a year for speed, but you also need users to get the new version when you deploy. The answer is content hashing in the filename — and every JS build tool (Vite, webpack, esbuild, Astro) does it for you.

Your bundler emits app.4f9a2c1.js where 4f9a2c1 is a hash of the file’s contents. Change one byte of source and the hash — and therefore the URL — changes. So you cache each file immutable for a year safely: a new build simply references a new URL, which is an automatic cache miss that fetches the fresh file. The old URL can stay cached forever; nothing points at it anymore.
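The idea behind a hashed filename can be sketched in a few lines. This is only an illustration of the principle, not what any bundler actually runs: `fnv1a` is a toy 32-bit hash standing in for the stronger digest a real build tool would use, and `hashedName` is a hypothetical helper.

```javascript
// Toy FNV-1a hash over UTF-16 code units, returned as 8 hex characters.
// Assumption: good enough to demonstrate the idea, not for production use.
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // multiply by the FNV prime, keep 32 bits
  }
  return hash.toString(16).padStart(8, '0');
}

// Insert the content hash between the base name and the extension.
function hashedName(filename, contents) {
  const dot = filename.lastIndexOf('.');
  const base = dot === -1 ? filename : filename.slice(0, dot);
  const ext = dot === -1 ? '' : filename.slice(dot);
  return `${base}.${fnv1a(contents)}${ext}`;
}

const before = hashedName('app.js', 'console.log(1)');
const after = hashedName('app.js', 'console.log(2)');

console.log(before === hashedName('app.js', 'console.log(1)')); // true: same bytes, same URL
console.log(before === after);                                  // false: one byte changed, new URL
```

Because the name depends only on the contents, unchanged files keep their URL across deploys and stay cached, while changed files get a fresh URL automatically.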

The HTML always points at the current hash (index.html):
<!-- Generated by your bundler; the hash changes only when contents change. -->
<script type="module" src="/assets/app.4f9a2c1.js"></script>
<link rel="stylesheet" href="/assets/main.b21e007.css" />

This is why the immutable + 1-year Cache-Control in the Express example is safe: the hashed filename is the cache-busting mechanism. The one file you don’t hash and don’t cache hard is index.html itself — it must be fetched fresh so it can hand out the latest asset URLs.
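One way to express that split is a helper that picks a Cache-Control value from the request path. Treat this as a sketch under assumptions: `cacheControlForPath` and the `HASHED_ASSET` pattern are hypothetical, `no-cache` is one reasonable choice for HTML (store it, but revalidate before every use), and the one-hour fallback is an arbitrary placeholder.

```javascript
// Matches names like app.4f9a2c1.js: a dot, 7 or more hex characters, a dot, an extension.
const HASHED_ASSET = /\.[0-9a-f]{7,}\.[a-z0-9]+$/i;

function cacheControlForPath(path) {
  if (HASHED_ASSET.test(path)) {
    // The URL changes whenever the contents change, so caching hard is safe.
    return 'public, max-age=31536000, immutable';
  }
  if (path.endsWith('.html') || path.endsWith('/')) {
    // Entry points must always hand out the latest asset URLs.
    return 'no-cache';
  }
  // Assumption: a modest default for unhashed static files.
  return 'public, max-age=3600';
}

console.log(cacheControlForPath('/assets/app.4f9a2c1.js'));   // public, max-age=31536000, immutable
console.log(cacheControlForPath('/assets/main.b21e007.css')); // public, max-age=31536000, immutable
console.log(cacheControlForPath('/index.html'));              // no-cache
```

With `no-cache`, the ETag revalidation described earlier keeps repeat visits cheap: an unchanged index.html comes back as a 304 rather than a full download.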

CDNs and hashed filenames push static content to the edge. But most apps spend their time on dynamic data — and for that we need caching strategies inside the app, which is the next lesson.