In the realm of graph theory, trees stand out as fundamental structures, embodying simplicity and elegance while possessing rich properties. As we have defined earlier, a tree is a connected, acyclic graph. This seemingly simple definition gives rise to a host of powerful characteristics and algorithms, particularly in the context of spanning trees and minimum spanning trees.

Spanning Trees

A spanning tree of a connected graph G = (V, E) is a subgraph T = (V, ET) that is a tree and includes all vertices of G (i.e., T has vertex set V and ET ⊆ E). In essence, a spanning tree connects all vertices of G using a minimal set of edges, without creating any cycles. Every connected graph contains at least one spanning tree.

Given a connected graph, one way to find a spanning tree is through a process of cycle removal. As outlined in Theorem 1.14, we can start with the original graph and iteratively remove edges that belong to cycles, ensuring that the graph remains connected at each step. This process continues until no cycles remain, at which point the resulting graph is a spanning tree.

Theorem 1.14: If G = (V, E) is a connected graph, the following algorithm finds a spanning tree T = (V, ET) with ET ⊆ E:

Algorithm: FindSpanningTree(G)
Input: Connected graph G = (V, E)
Output: Spanning tree T = (V, ET)
 
G' = G
while G' contains a cycle:
    Choose any cycle C in G'
    Remove an arbitrary edge e from cycle C in G'
return G'
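The pseudocode above can be sketched in Python as follows. This is a minimal sketch for simple undirected graphs; `find_cycle_edge` and `spanning_tree_by_cycle_removal` are hypothetical helper names, and the cycle is located with a DFS that reports any non-tree edge (in an undirected graph, such an edge always closes a cycle):

```python
from collections import defaultdict

def find_cycle_edge(vertices, edges):
    """Return one edge that lies on a cycle, or None if the graph is acyclic."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    visited = set()
    for start in vertices:
        if start in visited:
            continue
        stack = [(start, None)]  # (vertex, vertex we came from)
        while stack:
            node, par = stack.pop()
            if node in visited:
                continue
            visited.add(node)
            for nxt in adj[node]:
                if nxt == par:
                    continue            # skip the tree edge back to the parent
                if nxt in visited:
                    return (node, nxt)  # non-tree edge: closes a cycle
                stack.append((nxt, node))
    return None

def spanning_tree_by_cycle_removal(vertices, edges):
    """Repeatedly delete one edge of some cycle until the graph is acyclic."""
    remaining = list(edges)
    while True:
        e = find_cycle_edge(vertices, remaining)
        if e is None:
            # connected (by Lemma 1.15) and acyclic: a spanning tree
            return remaining
        u, v = e
        remaining.remove((u, v) if (u, v) in remaining else (v, u))
```

Note that each deletion preserves connectivity by Lemma 1.15, so the returned edge set is a spanning tree of the input graph.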

Lemma 1.15: If G = (V, E) is a connected graph, C is a cycle in G, and e is an edge of C, then the graph G − e (obtained by deleting edge e) is also connected.

Proof of Lemma 1.15: Consider any two vertices u, v ∈ V. Since G is connected, there exists a path P between u and v in G. If P does not contain edge e, then it is also a path in G − e, and we are done. If P does contain e, let e = {x, y}. Since e is part of the cycle C, there exists an alternative path between x and y along the remaining edges of C, and this path lies entirely in G − e. Therefore, we can replace the edge e in P with this alternative path to obtain a walk between u and v in G − e, from which a path can be extracted. Thus, G − e remains connected.

Proof of Theorem 1.14 (Correctness): Initially, G' = G is connected. By Lemma 1.15, removing a cycle edge preserves connectivity, so G' remains connected after every iteration. The algorithm terminates because each iteration removes one edge, so after at most |E| − |V| + 1 iterations no cycles can remain. At termination, G' is connected and acyclic, hence a tree on the vertex set V. Since we only remove edges from the original graph G, the resulting tree is a spanning tree of G.

Minimum Spanning Trees (MSTs)

In many applications, the edges of a graph are associated with costs or weights. Given a connected graph G = (V, E) and a cost function c: E → ℝ, we are often interested in finding a minimum spanning tree (MST), which is a spanning tree T = (V, ET) such that the sum of the costs of the edges in ET is minimized.

Cut and Cycle Properties for MSTs

To develop algorithms for finding MSTs, we consider two key properties related to cuts and cycles in graphs.

Lemma 1.16 (Cut Lemma): Let G = (V, E) be a connected graph, and let (S, V \ S) with Ø ≠ S ⊊ V be a cut in G. For any edge e crossing the cut with minimum cost among all edges crossing the cut, there exists a minimum spanning tree that includes e. If e is the unique minimum-cost edge crossing the cut, then every minimum spanning tree must include e.

Lemma 1.17 (Cycle Lemma): Let G = (V, E) be a connected graph, and let C be a cycle in G. For any edge e in C with maximum cost among all edges of C, there exists a minimum spanning tree that does not include e. If e is the unique maximum-cost edge of the cycle, then no minimum spanning tree contains e.

These lemmas provide the theoretical basis for MST algorithms. The Cut Lemma suggests a “greedy” approach of always selecting the cheapest edge across a cut, while the Cycle Lemma suggests avoiding the most expensive edge in a cycle.
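As a sanity check, both lemmas can be verified by brute force on a small example. The four-vertex graph and its edge costs below are an illustrative assumption, and `is_spanning_tree` is a hypothetical helper that tests acyclicity and connectivity with a small Union-Find:

```python
from itertools import combinations

def is_spanning_tree(n, tree_edges):
    """A set of n-1 edges spans vertices 0..n-1 iff no edge closes a cycle."""
    if len(tree_edges) != n - 1:
        return False
    parent = list(range(n))
    def find(x):  # Union-Find with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    for u, v in tree_edges:
        ru, rv = find(u), find(v)
        if ru == rv:
            return False  # cycle
        parent[ru] = rv
    return True

# hypothetical example graph: edge -> cost
edges = {(0, 1): 1, (0, 2): 4, (1, 2): 2, (1, 3): 5, (2, 3): 3}

# enumerate all spanning trees and keep the cheapest ones
msts, best = [], float("inf")
for cand in combinations(edges, 3):
    if is_spanning_tree(4, cand):
        cost = sum(edges[e] for e in cand)
        if cost < best:
            best, msts = cost, [set(cand)]
        elif cost == best:
            msts.append(set(cand))

# Cut Lemma: across the cut ({0}, {1, 2, 3}), edge (0, 1) with cost 1 is the
# unique cheapest crossing edge, so every MST must contain it.
assert all((0, 1) in t for t in msts)
# Cycle Lemma: on the cycle 0-1-2-0, edge (0, 2) with cost 4 is the unique
# most expensive edge, so no MST contains it.
assert all((0, 2) not in t for t in msts)
```

Exhaustive enumeration is of course only feasible on toy instances; it merely illustrates what the lemmas guarantee.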

Generic Algorithm: Blue and Red Rules

These Cut and Cycle Lemmas inspire a generic algorithm for finding MSTs based on coloring edges either blue or red.

Blue Rule:

  • Condition: Select a cut in the graph that has no blue edges crossing it.
  • Action: Color a minimum cost uncolored edge crossing the cut blue.

Red Rule:

  • Condition: Select a cycle in the graph that has no red edges.
  • Action: Color a maximum cost uncolored edge in the cycle red.

Theorem 1.18: For any connected graph G with cost function c, any sequence of applications of the Blue and Red Rules will eventually color all edges. The blue edges will form a minimum spanning tree.

This theorem provides a powerful framework, but it is not yet a concrete algorithm. We need to specify how to efficiently choose cuts and cycles and how to apply the Blue and Red Rules systematically.

Prim’s Algorithm (1.2.1)

Prim’s algorithm is a greedy algorithm that builds an MST by iteratively adding the cheapest edge that connects a vertex in the growing tree to a vertex outside the tree. It is a direct implementation of the Blue Rule.

Algorithm: Prim’s Algorithm

Algorithm: PrimMST(G, c)
Input: Connected graph G = (V, E), cost function c
Output: Minimum spanning tree T = (V, ET)
 
Choose an arbitrary start vertex v ∈ V
S = {v}  // Set of vertices in the growing tree
ET = Ø // Set of edges in the MST
 
while S ≠ V:
    Find a minimum cost edge e^* = {u, x} with u ∈ S and x ∈ V \ S
    Color e^* blue
    Add e^* to ET
    Add vertex x to S
 
return T = (V, ET)

Prim’s algorithm maintains a set S of vertices that are already part of the growing MST. In each step, it selects the cheapest edge that connects a vertex in S to a vertex outside S, adds this edge to the MST, and adds the new vertex to S. This process continues until all vertices are in S.

Efficiency of Prim’s Algorithm: A naive implementation, which scans all edges in each of the |V| − 1 iterations, requires O(|V| · |E|) time. However, using a priority queue to efficiently find the minimum-cost edge in each iteration, the runtime can be improved to O(|E| log |V|) with a binary heap, or even O(|E| + |V| log |V|) using Fibonacci heaps.
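A possible heap-based implementation is sketched below; the name `prim_mst` and the adjacency-list format (vertex → list of `(cost, neighbour)` pairs) are illustrative assumptions. Stale heap entries are skipped lazily rather than decreased in place, which keeps the code simple while staying within the binary-heap bound:

```python
import heapq

def prim_mst(adj, start):
    """Prim's algorithm with a binary heap.

    adj maps each vertex to a list of (cost, neighbour) pairs;
    assumes the graph is connected. Returns the list of MST edges (u, x, cost).
    """
    in_tree = {start}
    mst_edges = []
    # heap of candidate edges leaving the tree: (cost, tree vertex, outside vertex)
    heap = [(c, start, x) for c, x in adj[start]]
    heapq.heapify(heap)
    while heap and len(in_tree) < len(adj):
        c, u, x = heapq.heappop(heap)
        if x in in_tree:
            continue  # stale entry: x joined the tree earlier
        in_tree.add(x)            # "color {u, x} blue"
        mst_edges.append((u, x, c))
        for cost, y in adj[x]:
            if y not in in_tree:
                heapq.heappush(heap, (cost, x, y))
    return mst_edges
```

Each edge is pushed and popped at most twice, giving O(|E| log |V|) with Python's `heapq`.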

Kruskal’s Algorithm (1.2.2)

Kruskal’s algorithm is another greedy algorithm for finding MSTs, but it takes a different approach. Instead of growing a tree from a starting vertex, Kruskal’s algorithm considers all edges in increasing order of cost and adds an edge to the MST if it does not create a cycle. The rejection step is an application of the Red Rule: a skipped edge is a maximum-cost edge on the cycle it would close, since all other edges of that cycle were accepted earlier and are therefore no more expensive.

Algorithm: Kruskal’s Algorithm

Algorithm: KruskalMST(G, c)
Input: Connected graph G = (V, E), cost function c
Output: Minimum spanning tree T = (V, ET)
 
Sort edges E in non-decreasing order of cost: e1, e2, ..., em
ET = Ø  // Initialize MST edge set
Initialize Union-Find data structure for vertices V
 
for each edge ei = {x, y} in sorted order:
    if Find(x) ≠ Find(y): // Check if adding ei creates a cycle
        Color ei blue
        Add ei to ET
        Union(x, y) // Merge components of x and y
 
return T = (V, ET)

Kruskal’s algorithm starts with an empty graph and iteratively adds edges in increasing order of cost, as long as adding the edge does not create a cycle. Cycle detection is efficiently performed using a Union-Find data structure, which keeps track of connected components.

Efficiency of Kruskal’s Algorithm: Sorting the edges takes O(|E| log |E|) = O(|E| log |V|) time. The for loop iterates through all edges, and each Union-Find operation takes nearly constant amortized time (O(α(|V|)), where α is the inverse Ackermann function). Therefore, the overall runtime of Kruskal’s algorithm is dominated by the sorting step, resulting in a time complexity of O(|E| log |V|).
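The pseudocode translates almost directly into Python; `kruskal_mst` and the `(cost, u, v)` edge format are illustrative assumptions, and the Union-Find here uses path halving without union by rank, which is sufficient for a sketch:

```python
def kruskal_mst(n, edges):
    """Kruskal's algorithm for vertices 0..n-1.

    edges is a list of (cost, u, v) triples; assumes the graph is connected.
    Returns the list of MST edges in the order they were accepted.
    """
    parent = list(range(n))

    def find(x):  # Union-Find with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    mst_edges = []
    for cost, u, v in sorted(edges):      # non-decreasing order of cost
        ru, rv = find(u), find(v)
        if ru != rv:                      # adding {u, v} closes no cycle
            parent[ru] = rv               # Union: merge the two components
            mst_edges.append((cost, u, v))
    return mst_edges
```

Because the edges are processed in sorted order, the loop accepts exactly |V| − 1 edges before all vertices share one component.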

Excursion: Shannon’s Switching Game (1.2.3)

The section concludes with an excursion into a game called Shannon’s Switching Game, which provides a playful context to deepen our understanding of spanning trees and related concepts. This game involves two players, Maker and Breaker, who color edges of a graph. Maker wins if they can create a spanning tree using their colored edges, while Breaker wins if they can prevent this. The game provides a concrete setting to explore strategic aspects of graph connectivity and spanning structures.

Theorem 1.22: For a graph G = (V, E), Maker has a winning strategy in Shannon’s Switching Game if and only if G contains two edge-disjoint spanning trees.

This theorem beautifully connects the game-theoretic concept of a winning strategy to a graph-theoretic property—the existence of edge-disjoint spanning trees—highlighting the interplay between combinatorial games and graph theory.
