# MST

PDF of Eric’s handwritten notes are here.

Greedy strategies are a very natural approach to solve a problem: we take the choice which looks locally optimal. However, we are generally concerned with a global optimum. So, when do these locally optimal moves produce a global optimum?

We can think of a greedy algorithm as: given the choices we have made so far, make the choice that provides the most benefit now. We will see one example of a problem (computing the minimum spanning tree of a graph) where a greedy algorithm works well.

Graph Problems:

Consider the following general framework for graph problems.

Input: An undirected graph $G=(V,E), |V|=n, |E|=m$ with $E=\{(u_1,v_1), \ldots, (u_m, v_m)\}$ where each $u_i, v_i \in V$. We also have weights on all the edges, given by $w: E \rightarrow \mathbb{R}$.

Problem: Find a subset of edges, vertices (i.e. a subgraph) of $G$ of minimum total weight that satisfies property $P$.

Property $P$ could be one of many possibilities, e.g., “cycle” or “path” or “spanning”. Some will have efficient solutions, others will not. We consider a few special properties which are solvable via a greedy algorithm.

Minimum Spanning Tree (MST):

Suppose we consider $P$: “minimal size, connected, spanning” subgraph. A connected subgraph means every vertex is reachable from every other vertex, and a spanning subgraph means that it includes every vertex from $G$.

A “minimal size, connected, spanning” subgraph is in fact a spanning tree of $G$.

Claim: A minimal size subset of edges $F\subseteq E$ that connects every pair of vertices is a spanning tree. That is,

1. $F$ has no cycles
2. $F$ has $|V|-1=n-1$ edges; i.e. $|F|=n-1$.

Proof: (1) Suppose $F$ had a cycle. Then, we can remove any edge on the cycle while maintaining connectivity between every pair of vertices. Therefore, there can be no cycles if $F$ is minimal.

(2) We prove by induction that $|F|=n-1$. It’s clearly true for the base cases $n=1,2$. Now, for $n>2$, remove any edge in $F$. Then, $F \backslash \{e\}$ has $2$ components (otherwise $F$ was not minimal), and each component has $n_1, n_2 vertices with $n_1 + n_2 = n$. By induction,

$|F| = (n_1-1) + (n_2-1) + 1 = n-1$.

Problem (MST): Find the minimum (or maximum) weight spanning tree of $G$.

How to solve?

Let’s first consider a natural greedy algorithm to solve this problem. One possible approach is to simply select the lightest weight edge. Locally, this seems like the best edge to select. However, there is a potential problem: what if the lightest weight edge creates a cycle? Well, then we simply ignore that edge and proceed to the next lightest edge. We have essentially described Kruskal’s Algorithm:

Kruskal’s Algorithm for MST:

$KRUSKAL(V,E,w)$:

• Sort edges in increasing order of weight; so have an ordering $e_1, \ldots, e_m$ such that $w(e_i) \le w(e_{i+1})$.
• Set $T = \{\}$
• for $i=1$ to $m$
• Add $e_i$ to $T$ if it doesn’t create a cycle.
• Return $T$.

Theorem(Kruskal): Kruskal’s algorithm produces a minimum weight spanning tree of $G=(V,E)$.

The proof of the theorem follows almost immediately from a few helpful observations. First, define a cut $(S,\bar S)$ as the set of edges between a subset of vertices $S \subseteq V$ and its complement $\bar S = V \backslash S$. That is,

$(S,\bar S) = \{(u_i, v_j) \in E | u_i \in S, v_j \in \bar S\}$.

Lemma 1: Fix any edge $e$ whose weight is the minimum in some cut $(S,\bar S)$. Then, there exists an MST $T$ containing $e$.

Proof: Suppose $e$ is not in an MST. Consider an MST $T$ and add $e$ to $T$. Then, $e$ induces a cycle $C$ (since $T \cup \{e\}$ has two paths between endpoints of $e$).

Consider the cut $(S,\bar S)$ for which $e$ is a minimum weight edge. Then $C$ must cross $(S,\bar S)$ in some other edge $f$ with $weight(e) \le weight(f)$.

Set $T' = T \cup \{e\} \backslash \{f\}$. Note that $weight(T') \le weight(T)$, which means $T'$ is an MST (since it connects every pair of vertices and is minimum weight).

Lemma 2: Let $T_K$ be the tree found my Kruskal’s algorithm. For each edge $e \in T_K$, there exists a cut $(S, \bar S)$ such that

1. $e$ is the minimum weight edge of $(S, \bar S)$
2. $e$ is the unique edge in $T_K \cap (S, \bar S)$.

Proof: Consider $T_K \backslash \{e\}$, with $e = (u,v)$. Then, $u,v$ are in different components $(S, \bar S)$ and $e$ is the only edge of $(S, \bar S)$ in $T_K$. Also, at the point when $e$ was added, $u$ and $v$ were not connected, which means that $e$ is the minimum weight edge on $(S, \bar S)$ (otherwise, Kruskal’s algorithm would have added another edge before $e$).

Proof (of Theorem): Using Lemmas 1 and 2, we can now prove that Kruskal’s algorithm always produces an MST.

Suppose $T_K$ is not an MST, and let $T$ be an MST. Order the edges of $T_K$ by increasing weight. Take the first edge $e$ of $T_K$ that is not in $T$. Thus $T \cup \{e\}$ has a cycle $C$. Let $(S, \bar S)$ be the cut for $e$ given by Lemma 2. Now, remove $f \in C \cap T$ that crosses $(S, \bar S$. We have that $weight(e) \le weight(f)$ and $T \cup \{e\} \backslash \{f\}$ is a spanning tree.

Repeat this process for all edges of $T_K$ not in $T$, and thus $T_K$ is an MST.

The key idea of the proof is sometimes called the cut property.

Lemma 3 (cut property). Let $X$ be a subset of edges of an MST of a graph $G$. For any edge $e$ such that (1) $e$ is a minimum-weight edge of a cut $(S, \bar{S})$ and (2) $X \cap (S, \bar{S}) = \phi$, there is an MST $T$ s.t. $X \cup \{e \} \subseteq T$.

Using the cut property, we can prove that the following algorithm, called Prim’s algorithm, also produces an MST.

Running time of Kruskal’s algorithm: $O(m \log n)$.

Sorting the edges once up front takes $O(m \log n)$.

However, we also have to keep track of whether an added edge creates a cycle, i.e. when an edge connects two vertices in the same connected component. We can do this via a simple data structure.

For each vertex, introduce a component number, which is initially just the index of that vertex. When Kruskal’s algorithm adds an edge which connects two vertices with the same component number, we skip that edge because it creates a cycle. If the component numbers of the two vertices are different, then we relabel the vertices in the smaller component to the label of the larger component. Since we relabel the smaller component, the size of a vertex’s component at least doubles each time it is relabeled. Therefore a vertex’s label changes at most $O( \log n)$ times. Thus the total cost of this data structure is $O(n \log n)$.

Prim’s Algorithm:

Another algorithm to find the MST of a graph is Prim’s algorithm.

Running time of Prim’s algorithm: $O(m \log n)$.