Lecture from: 28.11.2024 | Rui Zhang's Notes | Video: Videos ETHZ
The Cheapest Path Problem
Recall from last time…
The cheapest path problem aims to find the path with the minimum total cost between two vertices in a graph. Costs are associated with edges, representing distances, time, or other metrics. We restrict ourselves to non-negative edge costs in this part, which allows for efficient algorithms. This problem has numerous applications in areas like navigation, network routing, and transportation planning.
Definitions and Assumptions
- Weighted Directed Graph: We work with a weighted directed graph $G = (V, E)$, where $V$ is the set of vertices and $E$ is the set of directed edges. Each edge $(u, v) \in E$ has a non-negative weight (cost) denoted by $c(u, v)$.
- Cheapest Path Cost: The cost of the cheapest path from vertex $s$ to vertex $v$ is denoted by $d(s, v)$. If no path exists, $d(s, v) = \infty$.
Recurrence Relation for Cheapest Paths
A fundamental concept for solving the cheapest path problem is the recurrence relation, which expresses the cheapest path cost recursively:

$$d(s, v) = \begin{cases} 0 & \text{if } v = s \\ \min_{(u, v) \in E} \bigl( d(s, u) + c(u, v) \bigr) & \text{if } v \neq s \text{ and } v \text{ is reachable from } s \\ \infty & \text{otherwise} \end{cases}$$

This states:
- The cheapest path from a vertex to itself has cost 0.
- The cheapest path from $s$ to $v$ (if $v \neq s$) is found by considering all in-neighbors $u$ of $v$, and choosing the one that minimizes the sum of the cheapest path cost $d(s, u)$ and the edge cost $c(u, v)$.
- If no path exists, the cost is infinite.
Implicit Ordering and Avoiding Circularity
This recurrence appears circular, as it requires knowing $d(s, u)$ to compute $d(s, v)$. However, algorithms like Dijkstra's address this by implicitly considering vertices in increasing order of their distances from the source vertex. This ordering ensures that by the time we compute $d(s, v)$, the values for relevant predecessors have already been correctly determined.

To make this clearer, let's redefine the recurrence relation in terms of a set $S$ containing vertices whose shortest distances from the source are known. We use $d_S(v)$ to denote the shortest distance from $s$ to $v$, considering only paths whose intermediate vertices lie in $S$:

$$d_S(v) = \min_{u \in S,\ (u, v) \in E} \bigl( d(s, u) + c(u, v) \bigr)$$

This revised recurrence solves the circularity issue. The algorithm's strategy is to iteratively expand $S$ by adding the vertex $v$ with the smallest $d_S(v)$ among those not yet in $S$. This ensures that when computing $d_S(v)$, all relevant $d(s, u)$ values are already known.
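To make the implicit ordering concrete, here is a minimal Dijkstra-style sketch (our own illustration, not code from the lecture), assuming the graph is given as a dict of dicts `graph[u] = {v: c(u, v)}` with non-negative costs:

```python
import heapq

def cheapest_paths(graph, s):
    """Dijkstra: grow the set S of settled vertices in increasing
    distance order; graph[u] = {v: c(u, v)} with c(u, v) >= 0."""
    d = {v: float('inf') for v in graph}
    d[s] = 0
    settled = set()                    # the set S from the text
    pq = [(0, s)]                      # (tentative distance, vertex)
    while pq:
        dist, u = heapq.heappop(pq)
        if u in settled:
            continue                   # stale queue entry, skip
        settled.add(u)                 # d[u] is now final
        for v, cost in graph[u].items():
            if dist + cost < d[v]:
                d[v] = dist + cost
                heapq.heappush(pq, (d[v], v))
    return d
```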
Proof of Correctness of the Recurrence with Implicit Ordering (Non-Negative Costs)
Recall from last time…
We prove that by maintaining the set $S$ and adding vertices in non-decreasing order of their tentative distances, the algorithm correctly computes the shortest path distances.

Theorem: When the algorithm selects a vertex $v$ with the minimum $d_S(v)$ to add to $S$, the tentative distance $d_S(v)$ equals the true shortest path distance $d(s, v)$.
Proof by Contradiction:

- Assumption: Assume there exists a cheaper path $P$ from $s$ to $v$ with cost $c(P) < d_S(v)$.
- Path Decomposition: Since $s \in S$ and $v \notin S$, path $P$ must leave $S$ at some point. Let $x$ be the last vertex in $S$ along $P$ before it leaves, and let $y$ be the first vertex outside $S$ along $P$. (Possibly $y = v$.)
- Cost Analysis: The prefix of $P$ from $s$ to $y$ uses only intermediate vertices in $S$, so it costs at least $d_S(y)$. Due to non-negative edge costs, the remainder of $P$ costs at least 0. Thus, $c(P) \ge d_S(y)$.
- Contradiction: Because $v$ was selected with the minimal tentative distance among vertices not in $S$, we know $d_S(y) \ge d_S(v)$. Combined with the assumed cheaper path, we get $d_S(v) > c(P) \ge d_S(y) \ge d_S(v)$, a contradiction.

Therefore, our assumption must be false, proving that $d_S(v) = d(s, v)$.
Negative Costs and Bellman-Ford Algorithm
As discussed, negative edge costs pose significant challenges for shortest path algorithms. Dijkstra’s algorithm fails in their presence. In this section, we delve deeper into the problem of general edge costs, exploring a dynamic programming approach.
Core Idea: Path Length as Ordering Criterion
The key idea is to organize the computation based on the length (number of edges) of the paths. We’ll iteratively compute the shortest path distances considering paths of increasing lengths. This approach avoids the pitfalls of negative cycles by explicitly limiting the path length. Note that this dynamic programming approach will compute the correct shortest path if one exists. It will, however, not terminate if there exists a negative cycle reachable from the source.
Bellman-Ford (MIT)
For an intuitive and simple explanation watch this…
l-Gute Schranken (l-Good Bounds)
The values $d_l(v)$ represent upper bounds on the true shortest path distances. After the $l$-th iteration, $d_l(v)$ is the cost of the cheapest path from the source vertex $s$ to vertex $v$ using at most $l$ edges. These are called $l$-good bounds. They are "good" because they are guaranteed to be valid upper bounds: $d_l(v)$ may not be the cost of the overall shortest path, but the true shortest path can never be more expensive than it.
Let's illustrate this with our example graph containing negative edge costs.

- $l = 1$: We consider paths of length 1 (i.e., direct edges from the source). For the top-most node, we find a path of cost 10. This is an $l$-good bound (an upper bound) because a cheaper path with more than one edge might still exist. For the bottom-middle node, we also find a path, with cost 100. Since there is no other path to this node using only one edge, this $l$-good bound of 100 is also the cost of the cheapest path using at most one edge. After this first iteration ($l = 1$), our set would contain the source node, the top-most node with $d_1 = 10$, and the bottom-middle node with $d_1 = 100$.
Improving Bounds: From $l$-Good to $(l+1)$-Good

We iteratively improve the $l$-good bounds $d_l(v)$, representing the cheapest path from $s$ to $v$ using at most $l$ edges.

A shortest path to $v$ with at most $l + 1$ edges either:
- Uses at most $l$ edges: then $d_{l+1}(v) = d_l(v)$.
- Uses exactly $l + 1$ edges: it consists of a shortest path with at most $l$ edges to an in-neighbor $u$, plus the edge $(u, v)$.
Formalization

For each vertex $v$:

$$d_{l+1}(v) = \min\Bigl( d_l(v),\ \min_{(u, v) \in E} \bigl( d_l(u) + c(u, v) \bigr) \Bigr)$$

We choose the cheaper option: the existing bound $d_l(v)$, or a path through an in-neighbor $u$ using $l + 1$ edges.
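For a quick numerical illustration (numbers made up for this note): suppose after $l = 1$ we have $d_1(v) = 10$ via a direct edge, while an in-neighbor $u$ has $d_1(u) = 2$ and $c(u, v) = 3$. The update then gives $d_2(v) = \min(10,\ 2 + 3) = 5$.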
Starting with the 0-good bounds (only the source at distance 0), we iterate for $l = 1, 2, \ldots, |V| - 1$, applying the equation. Each step refines our bounds. After $|V| - 1$ iterations, we have the true shortest path distances, assuming no negative cycles. This works due to the principle of optimality: optimal solutions are built from optimal subproblems. If a negative cycle is reachable, iterating until convergence won't terminate, as distances can decrease indefinitely. A further iteration after the $|V| - 1$ steps would reveal such a cycle.
Example: Iteratively Building Shortest Paths
Let’s illustrate the iterative improvement of bounds with a concrete example.
0-Good Bounds (Initialization)
Initially, we have 0-good bounds. Only the source node (A) has a distance of 0; all others are initialized to infinity, as they are not reachable with 0 edges.
Visually:
1-Good Bounds (First Iteration)
In the first iteration, we calculate 1-good bounds, considering paths of length at most 1. We examine direct neighbors of the source (A).
In tabular form:
2-Good Bounds and Beyond (Subsequent Iterations)
In the second iteration (calculating 2-good bounds), we consider paths of length at most 2. In this specific example, no further improvements are possible. The shortest paths to all reachable nodes have already been found in the previous iteration.
In general, we would continue iterating until $l = |V| - 1$ (where $|V|$ is the number of vertices), as a simple path without cycles can have at most $|V| - 1$ edges. This ensures that, in the absence of negative cycles, the algorithm has explored all possible shortest paths.
Bellman-Ford Algorithm
The Bellman-Ford algorithm computes shortest paths in a weighted directed graph, even with negative edge weights. It also detects negative cycles. If a negative cycle is reachable from the source, the algorithm reports it, as shortest paths are undefined in that case.
Algorithm Description
- Initialization:
  - Set the distance from the source vertex to itself to 0: $d[s] = 0$.
  - Set the distance from the source to all other vertices to infinity: $d[v] = \infty$ for all $v \neq s$. This indicates that initially, we don't know any paths to these vertices.
- Iteration (Relaxation):
  - Relax all edges in the graph $|V| - 1$ times, where $|V|$ is the number of vertices.
  - To "relax" an edge $(u, v)$ means to check if going from $s$ to $u$ and then directly to $v$ is cheaper than the current known path to $v$. If it is, update $d[v]$:
  - If $d[u] + c(u, v) < d[v]$, update $d[v] = d[u] + c(u, v)$.
- Negative Cycle Detection:
  - Perform one more iteration of relaxations.
  - If any distance can still be reduced, a negative cycle reachable from $s$ exists.
Pseudocode
```
BellmanFord(G, s):
    Initialize distances: d[s] = 0, d[v] = ∞ for all v ≠ s
    for i = 1 to |V| - 1:                  // Main relaxation loop
        for each edge (u, v) in E:
            if d[u] + c(u, v) < d[v]:
                d[v] = d[u] + c(u, v)
    for each edge (u, v) in E:             // Negative cycle detection
        if d[u] + c(u, v) < d[v]:
            return "Negative cycle detected"
    return d                               // Shortest path distances
```
Example:
```python
def bellman_ford(graph, source):
    distances = {vertex: float('infinity') for vertex in graph}
    distances[source] = 0
    for _ in range(len(graph) - 1):        # |V| - 1 relaxation rounds
        for u in graph:
            for v, weight in graph[u].items():
                if distances[u] != float('infinity') and distances[u] + weight < distances[v]:
                    distances[v] = distances[u] + weight
    for u in graph:                        # one extra round: negative cycle check
        for v, weight in graph[u].items():
            if distances[u] != float('infinity') and distances[u] + weight < distances[v]:
                print("Graph contains negative weight cycle")
                return None
    return distances
```
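A quick usage sketch with a small made-up graph (adjacency dicts as above), including one negative edge but no negative cycle:

```python
graph = {
    'A': {'B': 4, 'C': 2},
    'B': {'C': -3, 'D': 2},   # one negative edge, no negative cycle
    'C': {'D': 4},
    'D': {},
}
print(bellman_ford(graph, 'A'))
# {'A': 0, 'B': 4, 'C': 1, 'D': 5}
```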
Runtime Analysis: Deriving $O(|V| \cdot |E|)$

Let $|V|$ be the number of vertices (nodes) and $|E|$ be the number of edges.

- Initialization: Setting initial distances takes $O(|V|)$ time. We visit each vertex once.
- Relaxation: The outer loop runs $|V| - 1$ times. The inner loop iterates over all $|E|$ edges. Relaxing an edge takes constant time ($O(1)$). Therefore, the relaxation phase has a time complexity of $O((|V| - 1) \cdot |E|) = O(|V| \cdot |E|)$, since $|V| - 1$ has the same order of growth as $|V|$.
- Negative Cycle Detection: This step iterates over all edges once, taking $O(|E|)$ time.
- Total Runtime: The dominant term is $O(|V| \cdot |E|)$. The initialization and negative cycle detection steps are less significant in terms of asymptotic complexity.
- Considering Sparse Graphs ($|E| \in O(|V|)$): If the number of edges is approximately equal to the number of vertices minus 1 (which is the case in, for example, trees or minimally connected graphs), then $|E| \in O(|V|)$. Substituting this into the runtime complexity, we get $O(|V| \cdot |V|) = O(|V|^2)$: in this specific case where $|E|$ is close to $|V|$ (a sparse graph), the time complexity becomes quadratic.

In summary, the overall time complexity of the Bellman-Ford algorithm is $O(|V| \cdot |E|)$. In the specific case of a sparse graph where $|E| \in O(|V|)$, the runtime simplifies to $O(|V|^2)$.
Minimum Spanning Trees
The Elektrifizierung Problem (1926)
In 1926, Otakar Borůvka tackled the challenge of designing a cost-efficient electrical network for Moravia. The problem involved connecting cities with minimal wiring costs while ensuring all cities were connected. This is known as the minimum spanning tree (MST) problem in graph theory.
Modeling the Problem as an Undirected Graph:
The electrical network can be represented as an undirected, weighted graph:
- Vertices (Nodes): Represent the cities.
- Edges: Represent potential power lines between cities.
- Weights: Represent the cost of constructing a power line between two cities (e.g., distance, materials).
Crucially, the problem can be simplified. The source of electricity is irrelevant since we seek to connect all cities. We can conceptually treat all power sources as a single source. This allows us to model the network with an undirected graph, as the direction of power flow doesn’t affect the overall cost of connecting the cities.
Goal: Minimum Spanning Tree
The objective is to find a connected graph of minimal total edge weight that spans all cities. This, by definition, is a minimum spanning tree (MST): a subset of edges that connects all vertices without any cycles and with the minimum possible total edge weight. This tree represents the most cost-effective way to electrify the region, ensuring every city is connected to the power grid.
Minimum Spanning Trees (MSTs)
Greedy Algorithms: Minimum Spanning Tree (MIT)
A good explanation…
Given a connected, undirected graph $G = (V, E)$ with non-negative edge weights, a minimum spanning tree (MST) is a subgraph $T$ that fulfills the following conditions:

- Connects all vertices (Spanning): Every vertex in $V$ must be included in the MST. This ensures that all nodes in the original graph are reachable within the tree.
- Is acyclic: $T$ must not contain any cycles. If a cycle existed, we could remove any edge from the cycle, and the graph would remain connected but with a lower total weight, contradicting the minimality requirement of an MST. Therefore, acyclicity is essential for minimality.
- Has minimum total weight: The sum of the weights of the edges in $T$ must be the smallest possible among all spanning trees of $G$. Formally, if $w(e)$ represents the weight of edge $e$, then
$$w(T) = \sum_{e \in T} w(e)$$
must be minimized. This condition ensures that the chosen set of edges provides the least costly way to connect all vertices.

In summary, a subset $T \subseteq E$ forms an MST if the graph $(V, T)$ is connected and acyclic, and $w(T)$ is minimal among all possible connected, acyclic subgraphs that span $V$. Any subgraph satisfying these properties represents a valid solution to the minimum spanning tree problem.
Motivation: Clustering
Finding a Minimum Spanning Tree (MST) has a direct application in clustering. An MST inherently groups vertices into a single connected component based on proximity, represented by edge weights. This creates a cluster where all vertices are linked with the minimal possible total cost. Furthermore, strategically removing edges from the MST, starting with the heaviest ones, allows us to divide the graph into smaller clusters, effectively creating a hierarchical clustering structure.
Hierarchical Clustering with MSTs
The process of cutting an MST to form clusters is based on a simple yet powerful idea: the heavier an edge in the MST, the less “similar” the two vertices it connects are likely to be. Therefore, by removing heavy edges, we disconnect less similar vertices and partition the graph into more cohesive clusters.
Procedure:
-
Compute the MST: Start by finding the MST of the graph using an algorithm like Borůvka’s or Prim’s.
-
Edge Removal: Remove edges from the MST one at a time, in decreasing order of weight. After each removal, the MST will be split into one more connected component. Each connected component represents a cluster.
-
Hierarchy: The order in which edges are removed creates a hierarchy of clusters. Initially, all vertices are in a single cluster (the MST). As edges are removed, this cluster is successively divided into smaller and smaller clusters. This process can be visualized as a dendrogram, a tree-like diagram that illustrates the hierarchical relationship between clusters.
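As a sketch of this procedure (our own illustration: it assumes the MST is already given as a list of `(weight, u, v)` edges over vertices `0..n-1`), dropping the $k - 1$ heaviest MST edges yields $k$ clusters, which a simple traversal can collect:

```python
from collections import defaultdict

def mst_clusters(n, mst_edges, k):
    """Split an MST on vertices 0..n-1 into k clusters by dropping
    the k-1 heaviest edges. mst_edges: list of (weight, u, v)."""
    kept = sorted(mst_edges)[: len(mst_edges) - (k - 1)]  # drop heaviest k-1
    adj = defaultdict(list)
    for _, u, v in kept:
        adj[u].append(v)
        adj[v].append(u)
    clusters, seen = [], set()
    for start in range(n):
        if start in seen:
            continue
        stack, comp = [start], []
        seen.add(start)
        while stack:                      # DFS to collect one component
            x = stack.pop()
            comp.append(x)
            for y in adj[x]:
                if y not in seen:
                    seen.add(y)
                    stack.append(y)
        clusters.append(comp)
    return clusters
```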
Example: Applying to the Elektrifizierung Problem
Let's revisit the electrification problem discussed above. After computing the MST, imagine that the most expensive edge connects cities that are geographically far apart or separated by difficult terrain. Removing this edge would divide the network into two sub-networks, each representing a more localized cluster of cities. Further edge removals would continue to break down the network into smaller, more tightly connected clusters. This could inform decisions about building more localized power distribution infrastructure. For example, if a link to a certain city is extremely costly, it might be more reasonable to build a separate local power plant there.
Conclusion
Using MSTs for clustering provides a simple and intuitive method for grouping similar data points. By leveraging the structure of the MST and the weights of its edges, we can create a hierarchical representation of the data, revealing relationships at different levels of granularity. This technique is particularly useful when the data has a natural notion of distance or similarity that can be encoded in edge weights. The connection between MSTs and clustering further highlights the versatility of graph algorithms in solving a wide range of data analysis problems.
Now let us develop the algorithm step by step…
Distinct Edge Weights: A Simplifying Assumption
Before diving into Borůvka's algorithm, we introduce a simplifying assumption: all edge weights in the graph are distinct. This assumption simplifies the analysis and correctness proofs of MST algorithms. While real-world scenarios might not always satisfy this assumption, algorithms designed under this constraint often work correctly in the general case or can be adapted with tie-breaking rules. If edge weights are not distinct, a deterministic tie-breaking rule (e.g., based on vertex IDs) can be used to ensure that the algorithm behaves as if the edge weights were distinct.
Safe Edges: Guaranteed Inclusion in MST
A safe edge is an edge that is guaranteed to be part of some Minimum Spanning Tree (MST). Identifying safe edges allows us to build the MST incrementally by adding these edges without the risk of creating cycles or violating the MST properties.
Identifying Safe Edges: Building Blocks of MST Algorithms
The core idea behind efficient MST algorithms is to identify and add “safe edges”—edges guaranteed to be part of some MST—iteratively. Let’s explore two crucial concepts that help us pinpoint these edges:
The Cut Property: Dividing to Conquer
Imagine dividing the graph’s vertices into two disjoint sets, creating a “cut.” The Cut Property states that the cheapest edge crossing this cut must belong to at least one MST.
Intuition: Think of it like building bridges between two islands. To minimize cost, you’d always choose the cheapest bridge available, regardless of how other bridges connect within each island. This cheapest bridge represents a safe edge. No matter how you connect the cities on each side of the cut, including this cheapest cross-cut edge will always be part of an overall cheapest solution.
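A tiny made-up example: if a cut separates $S = \{a, b\}$ from $V \setminus S = \{c, d\}$, and the crossing edges $\{a, c\}$, $\{b, c\}$, $\{b, d\}$ have weights 5, 7, and 9, then the weight-5 edge $\{a, c\}$ is safe: any spanning tree that crosses the cut only via a heavier edge could be improved by swapping that edge for $\{a, c\}$.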
Locally Minimal Edges: A Refinement of the Cut Property
A locally minimal edge is an edge that’s the cheapest connection between a vertex and any vertex outside its immediate “neighborhood” (defined by a specific set of vertices). It’s a more localized version of the Cut Property.
Definition: An edge $e = \{u, v\}$ is locally minimal if there is a set $S$ of vertices containing $u$ but not $v$ (or vice versa), such that $e$ is the cheapest edge connecting $S$ to the rest of the graph.
Theorem: Every locally minimal edge is a safe edge.
Proof (by contradiction):

- Assume a locally minimal edge $e$ is not in any MST.
- Consider any MST $T$. Adding $e$ to $T$ would create a cycle.
- This cycle must contain another edge $e'$ crossing the cut defined by the set $S$ that made $e$ locally minimal.
- Since $e$ is locally minimal, $w(e) < w(e')$.
- Replacing $e'$ with $e$ in $T$ would create a new spanning tree with a lower total weight, contradicting the assumption that $T$ was an MST.

Therefore, $e$ must be in some MST, making it a safe edge.
Building the MST: An Iterative Approach
Armed with the concept of safe edges, we can construct an MST iteratively:
- Initialization: Begin with an empty set of edges.
- Iteration: Repeatedly find and add safe edges to this set until a spanning tree is formed.
This simple framework underpins several specific MST algorithms, each differing in how they efficiently identify safe edges. Borůvka’s algorithm, which we’ll explore next, provides a concrete example of this iterative approach. By focusing on finding and adding safe edges, we guarantee that the final set of edges will form a minimum spanning tree. The different algorithms essentially provide efficient strategies for discovering these safe edges.
Borůvka’s Algorithm: Connecting Components Efficiently
Borůvka’s algorithm, one of the earliest MST algorithms, provides an elegant and inherently parallel approach to constructing a minimum spanning tree. It operates by iteratively merging connected components through the selection of safe edges.
Algorithm Description
- Initialization:
  - Begin with an empty set $F$ of edges to represent the future MST.
  - Initially, each vertex in the graph is treated as its own separate connected component. Think of these as isolated islands that we want to connect with bridges.
- Iteration: As long as there's more than one component:
  - Find Cheapest Outgoing Edges: For each component, identify the cheapest edge connecting it to any other component. This is like finding the most affordable bridge from each island to a neighboring island. Use a deterministic tiebreaker (e.g., lowest vertex ID) if edge weights are not unique.
  - Merge Components: Add these cheapest edges to the set $F$. Merge the connected components joined by these edges. Note that some edges might be chosen by multiple components; only add each edge once to avoid duplicates. This step is like building the selected bridges, thus merging islands into larger landmasses.
- Termination: The algorithm stops when all vertices belong to a single connected component, a single "super-island." The set $F$ of edges now constitutes the MST.
Pseudocode
```
Boruvka(G):
    F = ∅                                  # Set of MST edges
    Components = {{v} for v in V}          # Initially, each vertex is its own component
    while |Components| > 1:
        SafeEdges = ∅
        for each component C in Components:
            cheapestEdge = findCheapestEdge(C, G)   # edge with minimum weight connecting C
                                                    # to another component; None if none exists
            if cheapestEdge is not None:
                SafeEdges.add(cheapestEdge)
        for edge (u, v) in SafeEdges:
            Components = mergeComponents(Components, u, v)   # merge components containing u and v
            if (v, u) not in F:            # prevent duplicate edges
                F = F ∪ {(u, v)}
    return F
```
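The helper routines `findCheapestEdge` and `mergeComponents` are left abstract above. Below is a compact, runnable Python sketch of the same idea (our own illustration, not the lecture's code): the graph is an edge list of `(weight, u, v)` tuples with distinct weights, and components are tracked with a plain label array instead of a full Union-Find.

```python
def boruvka(n, edges):
    """Borůvka's algorithm. n: number of vertices (0..n-1);
    edges: list of (weight, u, v) with distinct weights.
    Returns the set of MST edges. Assumes the graph is connected."""
    comp = list(range(n))              # comp[v] = label of v's component
    mst = set()
    num_components = n
    while num_components > 1:
        cheapest = {}                  # cheapest outgoing edge per component
        for w, u, v in edges:
            cu, cv = comp[u], comp[v]
            if cu == cv:
                continue               # edge inside one component; ignore
            for c in (cu, cv):
                if c not in cheapest or w < cheapest[c][0]:
                    cheapest[c] = (w, u, v)
        for w, u, v in cheapest.values():
            cu, cv = comp[u], comp[v]
            if cu == cv:
                continue               # already merged via another safe edge
            mst.add((w, u, v))
            for x in range(n):         # simple O(n) relabel of one component
                if comp[x] == cv:
                    comp[x] = cu
            num_components -= 1
    return mst

# Example with made-up weights:
edges = [(1, 0, 1), (2, 1, 2), (3, 0, 2), (4, 2, 3)]
print(boruvka(4, edges))   # {(1, 0, 1), (2, 1, 2), (4, 2, 3)}
```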
Runtime Analysis:

Per Iteration: Each iteration requires examining all edges to find the cheapest for each component and merging components, taking $O(|E|)$ time with efficient data structures (e.g., using a Union-Find data structure for merging).

Number of Iterations: The number of connected components at least halves with each iteration. This leads to a maximum of $O(\log |V|)$ iterations.

Total Runtime: Combining these, Borůvka's algorithm has a time complexity of $O(|E| \log |V|)$. This signifies that the runtime grows proportionally to the product of the number of edges and the logarithm of the number of vertices. It's important to note that this analysis assumes the use of efficient data structures for managing connected components and finding minimum edges. Without these optimizations, the runtime could be higher.
Prim’s Algorithm: Growing an MST from a Seed
Prim’s algorithm offers a different perspective on constructing MSTs. Instead of merging components, it starts with a single vertex and expands outward, like a growing tree, always maintaining a single connected component. This approach leverages a priority queue for efficiency.
Algorithm Description
- Initialization:
  - Select an arbitrary starting vertex $s$ as the initial "seed" for the MST.
  - Create an empty set to store the MST edges.
  - Initialize a set to track vertices currently in the MST.
  - Assign a key value `key[v]` to each vertex $v$: `∞` if no edge connects `s` to `v`, or `w(s, v)` if the edge `(s, v)` exists. `key[v]` represents the cheapest known connection cost from `v` to the growing MST.
  - Use a priority queue $Q$ to store all vertices, prioritized by their `key` values (lowest cost at the front).
- Iteration: While the priority queue is not empty:
  - Select and Add: Extract the vertex $u$ with the minimum `key` value from $Q$. This vertex is the cheapest to connect to the current MST. Add $u$ to the MST set.
  - Update Neighbors: For each neighbor $v$ of $u$ that's not in the MST set: if `w(u, v)` (the weight of the edge from $u$ to $v$) is less than `key[v]`, update `key[v]` to `w(u, v)` and update $v$'s priority in $Q$. This step essentially discovers potentially cheaper connections to vertices outside the current MST. Crucially, if a cheaper edge to `v` is found, it implies that the previously known cheapest edge to `v` cannot be part of the MST.
- Termination: When $Q$ is empty, all vertices are in the MST, and the implicitly chosen edges (tracked via updates to the `key` values and a `parent` dictionary in implementations) form the MST.
Python Implementation: Prim's Algorithm Using a Min-Heap
```python
import heapq

def prim(graph, start):
    mst_set = set()
    key = {v: float('inf') for v in graph}
    parent = {v: None for v in graph}   # stores the MST as parent pointers
    key[start] = 0
    priority_queue = [(0, start)]
    while priority_queue:
        _, u = heapq.heappop(priority_queue)
        if u in mst_set:                # already processed, skip stale entry
            continue
        mst_set.add(u)
        for v, weight in graph[u].items():
            if v not in mst_set and weight < key[v]:
                key[v] = weight
                parent[v] = u
                heapq.heappush(priority_queue, (key[v], v))
    return parent
```
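A quick usage sketch with a small made-up undirected graph (stored as symmetric adjacency dicts); the returned `parent` dictionary encodes the MST edges:

```python
graph = {
    'A': {'B': 2, 'C': 3},
    'B': {'A': 2, 'C': 1, 'D': 4},
    'C': {'A': 3, 'B': 1, 'D': 5},
    'D': {'B': 4, 'C': 5},
}
print(prim(graph, 'A'))
# {'A': None, 'B': 'A', 'C': 'B', 'D': 'B'}  -> MST edges A-B, B-C, B-D
```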
Runtime Analysis: Efficient with Priority Queues
Using a binary heap for the priority queue, Prim's algorithm has a runtime of $O((|V| + |E|) \log |V|)$, equivalent to Dijkstra's algorithm. A Fibonacci heap could theoretically achieve $O(|E| + |V| \log |V|)$, but binary heaps are often more practical.
Multithreaded example
Priority Queue (e.g. Min Heap)
A priority queue is an abstract data type that operates like a regular queue, but where each element has a priority associated with it. Elements are dequeued (removed) based on their priority, with the highest priority element always removed first. In a min-heap priority queue, the element with the smallest value (or key) has the highest priority. This makes it ideal for algorithms like Prim’s that need to repeatedly find and extract the element with the minimum key.
See Heap Sort
Heaps and Heap Sort (MIT)
Key Operations
- `make(nodes, key)`: Initializes all nodes with the key `key`.
- `insert(item, priority)`: Adds an `item` with the given `priority` to the queue.
- `extract_min()`: Removes and returns the item with the highest priority (lowest key).
- `decrease_key(item, new_priority)`: Decreases the priority (key) of a specific item. This is crucial for Prim's algorithm to update the cost of reaching a vertex when a cheaper path is found. It's important that this operation efficiently updates the internal structure of the priority queue to maintain the heap property.
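Python's `heapq` offers no native `decrease_key`; a common workaround (used in the Prim sketch above) is lazy deletion: push a new entry with the smaller key and skip outdated entries during extraction. A minimal wrapper illustrating the operations above might look like this (class and method names are our own, not a standard API):

```python
import heapq

class MinPQ:
    def __init__(self):
        self.heap = []          # entries: (key, item)
        self.best = {}          # current best key per item

    # make(nodes, key) would just be: for v in nodes: pq.insert(v, key)
    def insert(self, item, key):
        self.best[item] = key
        heapq.heappush(self.heap, (key, item))

    def decrease_key(self, item, new_key):
        if new_key < self.best.get(item, float('inf')):
            self.best[item] = new_key
            heapq.heappush(self.heap, (new_key, item))  # lazy: old entry stays

    def extract_min(self):
        while self.heap:
            key, item = heapq.heappop(self.heap)
            if self.best.get(item) == key:   # skip outdated (lazy-deleted) entries
                del self.best[item]
                return item
        raise IndexError("extract_min from empty queue")
```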
Common Implementations
- Binary Heap: A binary tree that satisfies the heap property (the value of each node is less than or equal to the value of its children). Binary heaps provide $O(\log n)$ time complexity for `insert`, `extract_min`, and `decrease_key` operations, where $n$ is the number of elements in the queue. This logarithmic performance is crucial for the overall efficiency of algorithms like Prim's. In practical implementations, `decrease_key` often involves knowing the location of the element in the heap; this can be achieved by storing a mapping (e.g., a dictionary) from elements to their positions in the heap.
- Fibonacci Heap: A more complex data structure that offers theoretically better performance, particularly for `decrease_key`, which takes $O(1)$ amortized time. However, the constant factors associated with Fibonacci heaps often make binary heaps a more practical choice for most applications.
Importance in Prim’s Algorithm
The priority queue is essential for Prim's algorithm to efficiently select the vertex with the minimum key in each iteration. The `extract_min` operation allows quick retrieval of the next vertex to add to the MST, and `decrease_key` enables efficient updates of the connection costs as the MST grows. Without a priority queue, finding the minimum key would require scanning all vertices, significantly increasing the algorithm's runtime. The efficient logarithmic operations of a binary heap priority queue are key to achieving the overall $O((|V| + |E|) \log |V|)$ time complexity of Prim's algorithm.
Continue here: 13 Kruskal’s Algorithm, Union Find, Self-Adapting Data Structures and Amortized Analysis