Heap & Priority Queue
01 / The one-sentence essence
To track the k most extreme elements out of N, keep a heap of size k whose root is the worst of the k — every new element either kicks the root out or doesn't.
Problemtop k largest from a streamStream[3, 1, 5, 12, 2, 11, 7, 8]k3
stream
30
11
52
123
24
115
76
87
— heap is empty —
heap = [ ]
Keep a min-heap of size k = 3. The root is the smallest of the top-k — any new element strictly larger kicks the root out.
step
0 / 17
heap size
0
min (root)
—
action
init
0 / 17
02 / The pattern signature
# top-k largest — min-heap of size kheap ← [ ]for x in stream:if |heap| < k:heap.push(x)elif x > heap[0]:heap.replace_root(x)return sort(heap, descending)# top-k smallest: max-heap and the dual comparison
03 / When to recognize this pattern
"top k / kth"
You need the k most or least extreme — not the full sorted order. The whole point is to spend less than
O(N log N)."streaming / online"
Elements arrive one at a time and you can't see them all at once. A heap of size k gives a constant-memory answer no matter how long the stream is.
"merge / multi-source"
You have several sorted sequences and need the next-smallest across all of them. A min-heap of (value, source) pairs runs in
O(N log k) total."frequency / weighted"
The "extreme" relation isn't on the raw value — it's on a derived key like a count or a distance. Same algorithm, different comparator.
04 / Common pitfalls
i.
Using a max-heap to find the largest.
For top-k largest you want a min-heap of size k — the root is the smallest of the keepers, and incoming larger elements kick it out. A max-heap of size k would put the biggest seen so far on top and never let you cheaply check the kth element.
ii.
Strict
> vs non-strict ≥ when replacing.With strict
>, an incoming element equal to the root is rejected — the existing duplicate stays. With ≥, it gets cycled in. The two produce the same set on inputs without duplicates; on duplicates, they differ. Match your problem.iii.
Reaching for a heap when sort is fine.
If you already have all N elements in memory and k is close to N, sorting is simpler and competitive. The heap wins when N ≫ k or the data is streaming and you can't buffer all of it.
05 / Go practice — on LeetCode
Four problems. The first is the canonical kth-largest. The second adds a frequency key — same algorithm, different comparator. The third stresses the streaming setup. The fourth is where the heap really earns its keep.