Indexing

What an index is and what it costs, B-tree vs LSM-tree, composite and covering indexes, and when a full table scan is actually fine.

9 min read Level 3/5 #system-design#databases#indexing

What you'll learn

Explain what an index is and the write/space cost it adds
Contrast B-tree and LSM-tree storage engines
Design composite and covering indexes for a query

An index is a separate data structure the database maintains so it can find rows without scanning the whole table. It’s the book’s index versus reading every page. Without one, finding a user by email means reading every row — an O(n) scan. With one, it’s a handful of pointer hops — roughly O(log n).

That speedup isn’t free, and understanding the bill is what separates “add an index” as a reflex from adding the right index on purpose.

What an index costs

Every index is a copy of some columns, kept sorted, that the database must update on every write. So:

Writes get slower. Insert one row, and the DB updates the table and every index on it. Ten indexes means eleven structures touched per insert — this is write amplification.
Space grows. Each index is extra on-disk data; heavily-indexed tables can have indexes larger than the table itself.
The optimizer has more to consider. Too many overlapping indexes can even confuse the query planner.

The rule of thumb: index the columns you filter, join, and sort on — and nothing else. An unused index is pure cost with no benefit.

B-tree vs LSM-tree

Two storage-engine families dominate, and the difference explains why some databases are read-optimized and others write-optimized.

A B-tree (Postgres, MySQL/InnoDB) keeps data in a balanced tree of fixed-size pages, updated in place. Lookups are a few page reads — excellent, predictable read latency. The cost is that random in-place writes mean random disk I/O, and writes can be slower under heavy load.

An LSM-tree (log-structured merge-tree — Cassandra, RocksDB, LevelDB, the engine under many wide-column and key-value stores) buffers writes in memory, then flushes them as sorted, immutable files appended to disk. Writes become fast sequential appends (recall from the latency lesson: sequential I/O crushes random I/O). The catch: a read may have to check several files, and a background compaction process must continually merge them.

	B-tree	LSM-tree
Writes	in-place, random I/O	sequential appends — fast
Reads	predictable, few hops	may scan several files
Write amplification	lower per write	higher (compaction)
Best for	read-heavy, OLTP	write-heavy ingestion
Examples	Postgres, InnoDB	Cassandra, RocksDB

This is why write-heavy stores reach for LSM-trees: turning random writes into sequential appends is the single biggest throughput lever available on the write path.

Composite and covering indexes

A composite index covers multiple columns in order — and order matters. An index on (status, created_at) accelerates queries that filter on status (optionally also ranging on created_at), but does not help a query that filters on created_at alone. Think of a phone book sorted by last name then first name: useless for finding everyone named “Ada.”

A covering index includes every column a query needs, so the database answers the query from the index without touching the table at all — an index-only scan. It’s the fastest a query can go short of a cache hit.

A covering index answers the query from the index alone script.js

-- Hot query: most recent open orders for a user, just id + total.
SELECT id, total
FROM orders
WHERE user_id = $1 AND status = 'open'
ORDER BY created_at DESC
LIMIT 20;

-- Composite + covering: filter cols first, sort col next, payload INCLUDEd.
-- The DB walks straight to this user's open orders, already sorted,
-- and reads id/total right out of the index — zero table lookups.
CREATE INDEX idx_orders_user_open
  ON orders (user_id, status, created_at DESC)
  INCLUDE (id, total);

▶ Preview: console

When a full scan is fine

Indexes aren’t always the answer. A full scan is the right call when:

The table is small — scanning 500 rows is faster than the index indirection.
The query touches most of the table anyway (a report over 80% of rows).
It runs rarely — a nightly batch job doesn’t justify a write-path tax.

The query planner often chooses a scan in these cases even if an index exists, because it estimates the scan is cheaper. Trust EXPLAIN over your instinct.

The JavaScript angle

Node engineers meet indexing most viscerally through an ORM, where a single innocent line of JS generates a query whose performance lives or dies by whether an index exists:

The query is one line; the index is the whole story script.js

// Prisma: looks like an in-memory filter, but it's SQL hitting disk.
const orders = await prisma.order.findMany({
  where: { userId, status: 'open' },     // ← needs an index on (userId, status)
  orderBy: { createdAt: 'desc' },        // ← ...ideally including createdAt
  take: 20,
});
// With the composite index: a few milliseconds.
// Without it: a full-table scan that gets slower every day the table grows.

▶ Preview: console

The takeaway for the JS dev: the ORM hides the index, not the cost. Read the queries your code emits, run EXPLAIN ANALYZE, and index the access patterns that run millions of times a day.

Indexes make a single node fast. The next problem is surviving when that node dies and serving reads from more than one machine — replication.