Union Find

01 / The one-sentence essence
Maintain a partition of N elements into disjoint components; collapse two components into one in near-constant time. A forest of trees where the root is the component identity, and every find shortens the tree it touches.
Problemcount connected components via DSUNodes6Edges5
0×11×12×13×14×15×1
Start: 6 nodes, each its own component. Process edges one by one — each union merges two trees.
step
0 / 16
edge
action
init
components
6
0 / 16

02 / The pattern signature

# two arrays + two operationsparent[i] i  # everyone is their own rootsize[i] 1 def find(x):while parent[x] ≠ x:parent[x] parent[parent[x]]  # compressx parent[x]return x def union(a, b):ra, rb find(a), find(b)if ra == rb: return Falseif size[ra] < size[rb]: ra, rb rb, raparent[rb] ra  # smaller under largersize[ra] += size[rb]return True

03 / When to recognize this pattern

"connected components"
Count or query whether a set of elements falls into the same group. Equivalence relation: reflexive, symmetric, transitive. If you'd naively run BFS/DFS from every node, suspect Union-Find.
"merge two groups"
The problem incrementally combines clusters — friends becoming friends-of-friends, edges added one at a time, accounts that share an email. The structure of the merges doesn't matter, only the membership.
"cycle detection in undirected graph"
Add edges one at a time; the edge that connects two already-in-the-same-component nodes closes a cycle. Used in Kruskal's MST and LC 684.
"offline queries"
When you can process queries in a batch (instead of online), DSU can answer "is x connected to y after these k operations?" in near-constant time per query.

04 / Common pitfalls

i.
Skipping the two optimizations.
A DSU without path compression OR union-by-size/rank degrades to O(N) per operation in the worst case (a left-leaning chain). With either optimization you get O(log N); with both you get α(N). Always include both — the code is ten lines either way.
ii.
Forgetting to find before comparing.
To check "are a and b connected?" you must compare find(a) == find(b), not parent[a] == parent[b]. The latter checks immediate parents — which only works after a global flatten.
iii.
Mutating during iteration.
Path compression rewrites the parent array during find. If you iterate parent directly while calling find on each entry, you'll see partial updates. Either snapshot first, or only call find when you genuinely need a fresh root.

05 / Go practice — on LeetCode

Four problems across the family. The first is the canonical "count provinces" introduction; the next two add structure (cycle detection, weighted edges); the fourth shifts to string-keyed nodes — a hash-map wrapper around the same DSU core.