Minimum Spanning Trees: Kruskal's, Prim's, and Union-Find
Connecting all vertices at minimum total cost with Kruskal's and Prim's algorithms, powered by the union-find data structure.
Terminology
- Spanning tree: a subgraph of a connected, undirected graph that includes all vertices and is a tree (connected, acyclic). A graph with $V$ vertices has a spanning tree with exactly $V - 1$ edges.
- Minimum spanning tree (MST): a spanning tree whose total edge weight is minimized among all possible spanning trees.
- Cut property: for any cut (partition of vertices into two sets), the lightest edge crossing the cut is in some MST. This is the theoretical foundation for both Kruskal's and Prim's.
- Union-Find (Disjoint Set Union): a data structure that tracks a collection of disjoint sets, supporting two operations: find (which set does an element belong to?) and union (merge two sets). Used by Kruskal's to detect cycles.
- Path compression: a Union-Find optimization where, during a find operation, every node on the path to the root is made to point directly to the root.
- Union by rank: a Union-Find optimization where the smaller tree is always attached under the root of the larger tree, keeping the structure balanced.
- Greedy: both Kruskal's and Prim's are greedy algorithms. They make locally optimal edge choices that lead to a globally optimal MST.
- Inverse Ackermann function $\alpha(n)$: the amortized time per operation for Union-Find with path compression and union by rank. It grows so slowly that it is effectively constant ($\leq 4$) for any practical input size.
What & Why
Given a connected, undirected, weighted graph, the minimum spanning tree connects all vertices with the minimum total edge weight. MST is one of the most fundamental graph problems, with applications in network design, clustering, and approximation algorithms.
Two classic algorithms solve MST:
- Kruskal's: sort all edges by weight, then greedily add the cheapest edge that does not create a cycle. Uses Union-Find for cycle detection.
- Prim's: start from any vertex, then greedily add the cheapest edge connecting the current tree to a new vertex. Uses a priority queue.
Both are greedy and both produce optimal MSTs. The choice depends on the graph:
- Kruskal's is better for sparse graphs (
Eclose toV) because it processes edges, and sortingEedges is cheap whenEis small - Prim's is better for dense graphs (
Eclose toV^2) because it processes vertices, and the priority queue stays small
How It Works
Kruskal's Algorithm
- Sort all edges by weight
- Initialize Union-Find with each vertex in its own set
- For each edge (in sorted order): if the two endpoints are in different sets, add the edge to the MST and union the sets
- Stop when the MST has
V - 1edges
Union-Find Data Structure
Union-Find is the engine behind Kruskal's. It supports:
- Find(x): return the representative (root) of the set containing
x - Union(x, y): merge the sets containing
xandy
With path compression and union by rank, both operations run in amortized O(\alpha(n)) time, where \alpha is the inverse Ackermann function (effectively constant).
Prim's Algorithm
Prim's grows the MST from a starting vertex. It maintains a priority queue of edges connecting the current tree to vertices not yet in the tree. At each step, extract the minimum-weight edge, add the new vertex to the tree, and insert all its edges to non-tree vertices into the priority queue.
Complexity Analysis
| Algorithm | Time | Space | Best for |
|---|---|---|---|
| Kruskal's | $O(E \log E)$ | $O(V + E)$ | Sparse graphs |
| Prim's (binary heap) | $O((V + E) \log V)$ | $O(V + E)$ | Dense graphs |
| Prim's (Fibonacci heap) | $O(E + V \log V)$ | $O(V + E)$ | Very dense graphs |
Kruskal's Analysis
Sorting $E$ edges takes $O(E \log E)$. Since $E \leq V^2$, $\log E \leq 2 \log V$, so this is also $O(E \log V)$. Processing each edge with Union-Find takes amortized $O(\alpha(V))$ per operation, totaling $O(E \cdot \alpha(V))$. The sort dominates:
Prim's Analysis
With a binary heap: each vertex is extracted once ($O(V \log V)$) and each edge may trigger a decrease-key ($O(E \log V)$):
Union-Find Amortized Complexity
With both path compression and union by rank, a sequence of $m$ operations on $n$ elements takes:
The inverse Ackermann function $\alpha(n)$ is at most 4 for any $n \leq 2^{2^{2^{65536}}}$, so it is effectively $O(1)$ per operation.
Implementation
Union-Find
ALGORITHM MakeSet(parent, rank, x)
parent[x] = x
rank[x] = 0
END
ALGORITHM Find(parent, x)
IF parent[x] != x THEN
parent[x] = Find(parent, parent[x]) // path compression
END IF
RETURN parent[x]
END
ALGORITHM Union(parent, rank, x, y)
rootX = Find(parent, x)
rootY = Find(parent, y)
IF rootX = rootY THEN RETURN // already in same set
// Union by rank
IF rank[rootX] < rank[rootY] THEN
parent[rootX] = rootY
ELSE IF rank[rootX] > rank[rootY] THEN
parent[rootY] = rootX
ELSE
parent[rootY] = rootX
rank[rootX] = rank[rootX] + 1
END IF
END
Kruskal's Algorithm
ALGORITHM Kruskal(vertices, edges, V, E)
INPUT: V vertices, E edges (each with u, v, weight)
OUTPUT: list of MST edges
SORT edges by weight ascending
CREATE parent and rank arrays of size V
FOR EACH vertex v DO
MakeSet(parent, rank, v)
END FOR
mst = empty list
FOR EACH edge (u, v, w) in sorted edges DO
IF Find(parent, u) != Find(parent, v) THEN
APPEND (u, v, w) to mst
Union(parent, rank, u, v)
IF SIZE(mst) = V - 1 THEN BREAK
END IF
END FOR
RETURN mst
END
Prim's Algorithm
ALGORITHM Prim(graph, V, start)
INPUT: weighted adjacency list graph, V vertices, start vertex
OUTPUT: list of MST edges
CREATE dist array of size V, set all to INFINITY
CREATE inMST array of size V, set all to FALSE
CREATE parent array of size V, set all to -1
dist[start] = 0
CREATE min-priority queue Q
INSERT (0, start) into Q
mst = empty list
WHILE Q is not empty DO
(d, u) = EXTRACT_MIN(Q)
IF inMST[u] THEN CONTINUE
inMST[u] = TRUE
IF parent[u] != -1 THEN
APPEND (parent[u], u, d) to mst
END IF
FOR EACH neighbor v of u with edge weight w DO
IF NOT inMST[v] AND w < dist[v] THEN
dist[v] = w
parent[v] = u
INSERT (w, v) into Q
END IF
END FOR
END WHILE
RETURN mst
END
Real-World Applications
- Network design: laying cable, fiber optic, or pipeline networks to connect all locations at minimum cost. This is the original motivation for MST.
- Clustering: single-linkage hierarchical clustering is equivalent to building an MST and removing the heaviest edges to form clusters.
- Approximation algorithms: the MST provides a 2-approximation for the Traveling Salesman Problem on metric graphs (the MST tour is at most twice the optimal TSP tour).
- Image segmentation: graph-based image segmentation algorithms use MST-like approaches to group similar pixels.
- Circuit design: connecting components on a circuit board with minimum total wire length is an MST problem.
Key Takeaways
- MST connects all vertices in a weighted undirected graph with minimum total edge weight, using exactly $V - 1$ edges
- Kruskal's sorts edges and uses Union-Find for cycle detection, running in $O(E \log E)$. Best for sparse graphs.
- Prim's grows the tree from a starting vertex using a priority queue, running in $O((V + E) \log V)$. Best for dense graphs.
- Union-Find with path compression and union by rank achieves near-constant amortized time per operation
- The cut property guarantees correctness: the lightest edge crossing any cut belongs to some MST