Top-K Elements with a Heap

A Size-K Min-Heap Finds the K Largest in O(n log k) — Far Better Than Sorting

Top-K Elements with a Heap

Maintaining a min-heap of size k while scanning n elements finds the top-k largest, most frequent, or closest in O(n log k) time and O(k) space.

4 min read Level 3/5 #dsa#heap#top-k
What you'll learn
  • Implement the top-k largest pattern using a size-k min-heap
  • Explain why O(n log k) beats O(n log n) sorting when k is small
  • Adapt the pattern for top-k frequent elements using a frequency map

Finding the k largest elements by sorting costs O(n log n). With a size-k min-heap you scan the array once and the heap does all the bookkeeping: O(n log k) time, O(k) extra space. When k is much smaller than n this is a significant win.

The Pattern

Keep a min-heap of the k largest values seen so far. For each new element:

  1. If the heap has fewer than k elements, push unconditionally.
  2. Otherwise, if the new element is larger than the heap minimum (peek), pop the minimum and push the new element.
  3. After the scan, the heap contains exactly the k largest elements.

The minimum of the heap is always the k-th largest element overall.

class MinHeap {
  /* ... full MinHeap implementation from the previous lesson ... */
}

function topKLargest(nums, k) {
  const heap = new MinHeap();

  for (const num of nums) {
    heap.push(num);
    if (heap.size > k) heap.pop(); // evict the smallest
  }

  // Drain the heap into a sorted result (largest last)
  const result = [];
  while (heap.size) result.unshift(heap.pop());
  return result;
}

console.log(topKLargest([3, 1, 5, 12, 2, 11, 7], 3));
// [7, 11, 12]

Top-K Frequent Elements

For frequency problems, build a count map first, then apply the same pattern on (frequency, element) pairs:

function topKFrequent(words, k) {
  // Step 1: count frequencies
  const freq = new Map();
  for (const w of words) freq.set(w, (freq.get(w) ?? 0) + 1);

  // Step 2: size-k min-heap ordered by frequency
  const heap = new PriorityQueue((a, b) => {
    if (a.count !== b.count) return a.count - b.count; // lower freq loses
    return b.word.localeCompare(a.word);               // tie-break: lex desc
  });

  for (const [word, count] of freq) {
    heap.enqueue({ word, count });
    if (heap.size > k) heap.dequeue();
  }

  const result = [];
  while (!heap.isEmpty()) result.unshift(heap.dequeue().word);
  return result;
}

const words = ['apple','banana','apple','cherry','banana','apple'];
console.log(topKFrequent(words, 2)); // ['apple', 'banana']

Complexity Comparison

ApproachTimeSpace
Sort all, take first kO(n log n)O(n)
Size-k min-heapO(n log k)O(k)
QuickSelect (kth only)O(n) avgO(1)

The heap approach shines in streaming scenarios where n is unknown in advance or too large to store entirely — you only ever hold k elements in memory.

Closest K Points to Origin

The same pattern adapts to distance problems. Use a max-heap of size k keeping the k smallest distances; evict the largest whenever size exceeds k:

function closestK(points, k) {
  const maxHeap = new PriorityQueue((a, b) => b.dist - a.dist);

  for (const [x, y] of points) {
    const dist = x * x + y * y; // skip sqrt, monotone
    maxHeap.enqueue({ point: [x, y], dist });
    if (maxHeap.size > k) maxHeap.dequeue();
  }

  return [...Array(maxHeap.size)].map(() => maxHeap.dequeue().point);
}

Up Next

Take the two-heap idea further: maintain two heaps simultaneously to compute a running median in O(log n) per insertion.

Running Median with Two Heaps →