Union Find
01 / The one-sentence essence
Maintain a partition ofNelements into disjoint components; collapse two components into one in near-constant time. A forest of trees where the root is the component identity, and everyfindshortens the tree it touches.
Problemcount connected components via DSUNodes6Edges5
Start: 6 nodes, each its own component. Process edges one by one — each union merges two trees.
step
0 / 16
edge
—
action
init
components
6
0 / 16
02 / The pattern signature
# two arrays + two operationsparent[i] ← i # everyone is their own rootsize[i] ← 1 def find(x):while parent[x] ≠ x:parent[x] ← parent[parent[x]] # compressx ← parent[x]return x def union(a, b):ra, rb ← find(a), find(b)if ra == rb: return Falseif size[ra] < size[rb]: ra, rb ← rb, raparent[rb] ← ra # smaller under largersize[ra] += size[rb]return True
03 / When to recognize this pattern
"connected components"
Count or query whether a set of elements falls into the same group. Equivalence relation: reflexive, symmetric, transitive. If you'd naively run BFS/DFS from every node, suspect Union-Find.
"merge two groups"
The problem incrementally combines clusters — friends becoming friends-of-friends, edges added one at a time, accounts that share an email. The structure of the merges doesn't matter, only the membership.
"cycle detection in undirected graph"
Add edges one at a time; the edge that connects two already-in-the-same-component nodes closes a cycle. Used in Kruskal's MST and LC 684.
"offline queries"
When you can process queries in a batch (instead of online), DSU can answer "is x connected to y after these k operations?" in near-constant time per query.
04 / Common pitfalls
i.
Skipping the two optimizations.
A DSU without path compression OR union-by-size/rank degrades to
O(N) per operation in the worst case (a left-leaning chain). With either optimization you get O(log N); with both you get α(N). Always include both — the code is ten lines either way.ii.
Forgetting to find before comparing.
To check "are a and b connected?" you must compare
find(a) == find(b), not parent[a] == parent[b]. The latter checks immediate parents — which only works after a global flatten.iii.
Mutating during iteration.
Path compression rewrites the
parent array during find. If you iterate parent directly while calling find on each entry, you'll see partial updates. Either snapshot first, or only call find when you genuinely need a fresh root.05 / Go practice — on LeetCode
Four problems across the family. The first is the canonical "count provinces" introduction; the next two add structure (cycle detection, weighted edges); the fourth shifts to string-keyed nodes — a hash-map wrapper around the same DSU core.