Lecture from: 18.02.2025 | Video: HomeLab
Introduction: The Importance of Probability
Probability plays a vital role in various scientific disciplines, including quantum physics and biology (e.g., evolution, mutation modeling).
Probability in Computer Science:
- Algorithmic Efficiency: Probability can be leveraged to design faster and simpler algorithms.
- Symmetry Breaking: Randomized algorithms can exhibit different behaviors in situations where deterministic algorithms might struggle, such as resolving contention in network protocols (like Wi-Fi).
- Input Uncertainty: Probability is fundamental for dealing with uncertainty in input data – a core concept in statistics (although not the focus of this course).
Examples of Probabilistic Algorithms
Dictionaries
- Operations:
insert
,delete
,search
. - Naive Implementation: time complexity (e.g., linear search in an unsorted array).
- Data Structures (Previous Semester): time complexity using balanced search trees (e.g., AVL trees, Red-Black trees).
- This Semester: expected time complexity using hash tables (with appropriate randomization and assumptions).
Primality Testing
- Requirement: Determining if a number is prime in polynomial time.
- Randomized Algorithms (Since 1977): Simple, efficient, and practical probabilistic algorithms exist (e.g., Solovay-Strassen, Miller-Rabin). These algorithms have a small probability of error but can be made arbitrarily reliable.
- Deterministic Algorithms: A deterministic polynomial-time algorithm was not known until 2002 (Ask primality test). However, the Ask algorithm, while theoretically significant, has large constant factors, making it less practical than the randomized alternatives in many cases.
The Solovay-Strassen and Miller-Rabin primality tests, developed in the late 1970s, were groundbreaking in demonstrating the power of randomness in algorithm design.
Course Contents
Topics we are going to look at throughout the course:
- Graph Theory and Algorithms: A foundational topic for many computational problems.
- Discrete Probability Theory and Randomized Algorithms: Focus on discrete probability spaces and algorithms that utilize randomness.
- Algorithms - Highlights: A selection of important and illustrative algorithms.
Graph Theory
“Algorithms and Data Structures” (AnD) Review:
- Depth-First Search (DFS) and Breadth-First Search (BFS): Fundamental graph traversal algorithms.
- Euler Tour: A path in a graph that visits every edge exactly once. Conditions exist for a graph to have an Euler tour.
- Minimum Spanning Trees (MSTs): Finding a tree that connects all vertices in a weighted graph with the minimum total edge weight (e.g., Kruskal’s, Prim’s algorithms).
- Shortest Paths: Finding the shortest path between two vertices in a weighted graph (e.g., Dijkstra’s algorithm, Bellman-Ford algorithm).
“Algorithms and Probability” (AnW) Topics:
- Connectedness: Determining if a graph is connected (i.e., there is a path between any two vertices).
- Hamiltonian Circuits: A cycle in a graph that visits every vertex exactly once (unlike an Euler tour, which visits every edge once). Finding Hamiltonian circuits is a notoriously difficult problem (NP-complete).
- Matchings: Finding a subset of edges in a graph such that no two edges share a common vertex. Maximum matching seeks the largest such subset. Perfect matching requires all nodes to be matched.
- Graph Coloring: Assigning colors to vertices of a graph such that no two adjacent vertices have the same color. The goal is often to minimize the number of colors used. The chromatic number of a graph is the minimum number of colors required for a valid coloring.
Connectedness in Graphs
A fundamental property of a graph is its connectedness. This section explores various aspects of connectedness, including k-connectivity and its relationship to vertex-disjoint and edge-disjoint paths.
K-Connectivity (MIT)
k-connected (Vertex Connectivity)
Let be an undirected graph, where represents the set of vertices (nodes) and represents the set of edges.
- Connected Graph: A graph is connected if, for any two distinct nodes , there exists at least one path from to .
We are interested not just in whether a graph is connected, but how robustly it is connected. Vertex connectivity quantifies this robustness by considering how many vertices must be removed to disconnect the graph.
- k-connected Graph (k-vertex-connected): A graph is k-connected (or k-vertex-connected) if:
- It has at least vertices (). (We need at least vertices so that we can remove vertices and still have something left).
- Removing any set of fewer than vertices leaves the remaining graph connected. Formally, for all subsets with , the induced subgraph is connected. is the graph formed by removing the vertices in , along with all edges that had an endpoint in , from the original graph .
Menger’s Theorem (Vertex Version)
Menger’s Theorem provides an alternative, equivalent definition of k-connectivity, linking it to the concept of vertex-disjoint paths:
Theorem: A graph is k-connected if and only if for every pair of distinct vertices (), there exist at least internally vertex-disjoint paths between and .
- Internally Vertex-Disjoint Paths: Paths are internally vertex-disjoint if they share no vertices except for the starting and ending vertices ( and ). In other words, they can only intersect at the endpoints.
Intuition: Menger’s Theorem tells us that a graph is robustly connected (k-connected) if there are multiple independent “routes” between any two nodes. If one or more nodes fail (are removed), alternate routes still exist.
Example
In this example:
- The u and v are 3-connected, but not 4-connected, because making a new disjoint-path would reuse already used nodes.
- Removing any two nodes, will not make u and v disconnected.
k-edge-connected (Edge Connectivity)
Similar to vertex connectivity, we have edge connectivity, which measures how many edges must be removed to disconnect the graph.
- k-edge-connected Graph: A graph is k-edge-connected if removing any set of fewer than k edges leaves the remaining graph connected. Formally, for all subsets with , the graph is connected. This is the graph formed by removing the edges in set X.
Menger’s Theorem (Edge Version)
Menger’s theorem also has an edge-based counterpart:
Theorem: A graph is k-edge-connected if and only if for every pair of distinct vertices, there exist at least k edge-disjoint paths between them.
- Edge-Disjoint Paths: Paths are edge-disjoint if they share no edges. They can share vertices.
Intuition: Edge connectivity measures the redundancy of connections in terms of edges. If there are multiple edge-disjoint paths between any two nodes, the graph can tolerate edge failures.
Example
Take the graph from before:
Here, we might think that u-v are 3-edge connected, but actually we are 4-edge connected. This stems from the fact that we are looking for disjoint sets of edges, nodes don’t matter. Meaning this allows us to construct a new path from u to v.
This shows that removing 3 edges doesn’t suffice to disconnect u and v. The above example also shows us the relationship between connectivity measures.
Relationship between Connectivity Measures
There’s a fundamental relationship between vertex connectivity (), edge connectivity (), and the minimum degree of a graph ():
- Minimum Degree (): The minimum degree of a graph is the smallest degree (number of incident edges) of any vertex in the graph.
Explanation:
-
: Consider a vertex v with the minimum degree, . If we remove all the edges incident to v, then v becomes isolated, disconnecting the graph. Therefore, we can disconnect the graph by removing at most edges, so the edge connectivity cannot be greater than the minimum degree.
-
: If we remove edges and disconnect the graph, then consider the two components that became disconnected. We could also disconnect the graph by removing all vertices in one component that were endpoints of the removed edges. There can be at most such vertices (and potentially fewer, if some removed edges shared endpoints). Therefore, the vertex connectivity cannot be greater than the edge connectivity.
Special Cases and Properties of Connectivity (k=1)
Let’s consider the special case when k=1, which relates to simple connectedness, and explore related concepts like cut vertices and cut edges.
Articulation Vertex (Cut Vertex)
A cut vertex (or articulation vertex) is a vertex in a connected graph whose removal (along with its incident edges) disconnects the graph. In other words, if is a connected graph, and is a cut vertex, then the induced subgraph is not connected. A graph that is 1-connected may have cut vertices, but a graph that is not connected has, by definition, no path between some pairs of nodes, even before any vertices are removed. A graph with -connectivity, where , has no cut vertices.
Articulation Edge (Cut Edge or Bridge)
A cut edge (or bridge) is an edge in a connected graph whose removal disconnects the graph. If is a connected graph, and is a cut edge, then the graph is not connected. Similar to cut vertices, a graph with -edge-connectivity, where , has no cut edges.
Lemma: Relationship between Cut Vertices and Cut Edges
There’s a relationship between cut vertices and cut edges:
Lemma: Let be a connected graph. If an edge is a bridge, then either the degree of is 1 () or is a cut vertex, and the same applies to : either or is a cut vertex. (The converse is not necessarily true).
Proof (Informal):
Suppose is a bridge. If we remove , the graph becomes disconnected into at least two components. Let’s call the component containing as and the component containing as .
-
Case 1: : If has degree 1, then the edge is the only edge connected to . Removing this edge isolates , making the graph disconnected.
-
Case 2: : If has a degree greater than 1, then it has other neighbors besides . Since removing disconnects the graph, all other neighbors of must be in the same component . If we remove itself (and all its incident edges), will be separated from , thus disconnecting the graph. Therefore, is a cut vertex. The same argument applies symmetrically to vertex .
Important Note: The converse is not always true. A cut vertex does not guarantee that all of its incident edges are bridges.
Structure of Graphs: Blocks
We can decompose a graph into “blocks” based on the existence of cycles. This decomposition helps in understanding the graph’s structure and connectivity properties.
Equivalence Relation on Edges:
We define an equivalence relation, denoted by , on the set of edges of a graph as follows:
For two edges , if and only if either:
- (the edges are the same), OR
- There exists a cycle in that contains both and .
Blocks:
The equivalence classes of this relation are called blocks. A block is a maximal subgraph that is 2-edge connected. A block consists of edges.
Intuition: Think of blocks as the “biconnected components” of the graph, but defined in terms of edges. If two edges lie on a common cycle, they are “strongly connected” in the sense that removing either one of them won’t disconnect the graph. A block is a maximal set of edges with this property.
Proof that is an Equivalence Relation
To prove that is an equivalence relation, we must show that it is reflexive, symmetric, and transitive.
-
Reflexive: For any edge , because of condition 1 ().
-
Symmetric: If , then either (in which case ) or there exists a cycle containing both and . The existence of this cycle also implies that .
-
Transitive: This is the least trivial part. If and , we need to show that .
- If or or , the statement follows trivially.
- Otherwise, there exists a cycle containing and , and a cycle containing and . We need to show there’s a cycle containing and . This requires a more involved argument, often involving constructing a new cycle from parts of and (details shown below).
Transitivity of the Equivalence Relation
Recall: if and only if or there exists a cycle containing both and .
To Prove: If and , then .
Proof:
We are given that and . We need to show that .
-
Trivial Cases:
- If , then is the same as , which is given.
- If , then is the same as , which is given.
- If , then holds by reflexivity.
-
Non-Trivial Case: Assume , , and . Since , there exists a cycle containing both and . Since , there exists a cycle containing both and . Our goal is to demonstrate the existence of a cycle containing both and .
Naive Approach (and why it can fail):
A naive approach might be to try and “fuse” cycles and together to form a new cycle containing and . However, this direct fusion doesn’t always work. Consider the following scenario:
- contains edges , , and other edges forming a path from the end of back to the start of , and another path from the end of f back to the start of e.
- contains edges , , and other edges forming a path from the end of back to the start of , and another path from the end of g back to the start of f.
If and share only the edge , we can construct a new cycle by following , then , then the reverse of and g, followed by the reverse of .
However, if and share other vertices or edges besides , simply concatenating parts of and might not result in a simple cycle (it might have self-intersections).
Correct Approach (using Menger’s Theorem implicitly):
-
Construct a Subgraph: Consider the subgraph formed by the union of the edges in and .
-
2-Edge-Connectivity: is 2-edge-connected. This is because removing any single edge from will leave at least one of the cycles or intact, and since these cycles share the edge , the remaining graph will still be connected. Even if the removed edge is , we still have two edge disjoint path via the remaining parts of the cycles.
-
Menger’s Theorem (Edge Version): Since is 2-edge-connected, Menger’s Theorem (edge version) guarantees that there exist at least two edge-disjoint paths between any two vertices in .
-
Paths between Endpoints of e and g: Let and be the endpoints of edge , and let and be the endpoints of edge . Because is 2-edge-connected, there exist two edge-disjoint paths, say and , between and in .
-
Constructing a Cycle:
- We know have two edge-disjoint paths, say and , connecting one endpoint of , to one endpoint of .
- We take edge e, path , edge g and path and we have a closed walk containing the edges e and g.
-
The cycle contains both and . Therefore, .
Conclusion:
The transitivity property holds for the relation . Therefore, is a valid equivalence relation. The key to the proof is leveraging the 2-edge-connectivity of the subgraph formed by the union of the two cycles and applying Menger’s theorem to guarantee the existence of edge-disjoint paths, which are then used to create a cycle containing both and .
Lemma: Intersection of Blocks
Lemma: Two distinct blocks of a graph can intersect in at most one vertex, and that vertex must be a cut vertex.
Proof (Intuitive Explanation):
Suppose two distinct blocks, and , share two or more vertices. Then we could construct a cycle that includes edges from both and . This would mean that the edges in and belong to the same equivalence class (they lie on a common cycle), contradicting the assumption that and are distinct blocks. Therefore, two distinct blocks can share at most one vertex.
If blocks and share a vertex , and we remove , then all paths between edges in and edges are broken. Therefore, removing disconnects the graph, making a cut vertex.
Intuition: Blocks are “tightly knit” parts of the graph connected by cycles. They can only “touch” at single points (cut vertices), which act like “hinges” or “articulation points” connecting the blocks. If they shared more than one vertex, you could “merge” them into a single, larger block.
Block Graph (Block-Cut Tree)
The block graph, also known as the block-cut tree, provides a simplified representation of a connected graph’s structure by highlighting its biconnected components (blocks) and cut vertices.
Definition
Let be a connected graph. The block graph of is a bipartite graph , where:
- is the set of cut vertices of .
- is the set of blocks of .
- is the set of edges in the block graph. There is an edge , where and , if and only if the cut vertex is contained within the block in the original graph .
Key Property (Tree Structure)
Theorem: If is a connected graph, its block graph is a tree.
Why is it a tree?
- Connected: The block graph is connected because the original graph is connected. Every part of is represented in the block graph.
- Acyclic: Assume, for the sake of contradiction, that the block graph contains a cycle. This cycle would have to alternate between block nodes and cut vertex nodes. A cycle implies that there are at least two distinct paths between two nodes in the block graph. This would mean that at least two blocks share two or more vertices. We already have show that this implies that the blocks should be merged, this contradict the blocks. Therefore, the block graph cannot contain a cycle.
Example
The image above shows a graph with:
- Cut Vertices (Red): These are the nodes whose removal would disconnect the graph.
- Blocks (Colored Regions): These are the maximal subgraphs where any two edges lie on a common cycle (or are the same edge). The blue, green, and purple regions each represent a block.
- Grey Nodes: These are nodes which are not cut vertices.
The bottom part of the image depicts the block graph:
- Red Nodes: These correspond to the cut vertices of the original graph.
- Colored Nodes (not red): These correspond to the blocks of the original graph (the blue, green, yellow and purple colored regions).
- Edges: There exists an edge between the non-red nodes, and the red nodes, according to the definition above.
Intuition and Usefulness:
- Simplified Structure: The block graph provides a higher-level, simplified view of the graph’s structure. It shows how the biconnected components (blocks) are connected through the cut vertices.
- Connectivity Analysis: The block graph is useful for analyzing the connectivity properties of a graph. For instance, finding all cut vertices becomes equivalent to finding all degree-1 nodes (leaves) in the block tree that correspond to block.
- Bottlenecks: The cut vertices are where the flow between blocks can get bottlenecked.
Bipartite Nature: The block graph is bipartite because edges connect nodes from set A (cut vertices) to nodes of set B (blocks), meaning nodes withing A and B don’t have edges in between.
Why is this useful? (Example: Navigation)
Consider a navigation application like Google Maps.
- Large Road Networks: Road networks are often very large graphs.
- Bridges and Tunnels: Bridges and tunnels are often represented by cut edges (bridges) in the road network graph. Major intersections or critical road segments that, if blocked, would separate parts of the city, might be represented by cut vertices.
- Route Planning: When planning a route, if a bridge (cut edge) is closed, the navigation system needs to find an alternative route. Knowing the block structure of the graph helps efficiently identify alternative routes that bypass the blocked bridge or cut vertex. The block graph provides a simplified, higher-level view of the network, allowing the algorithm to focus on re-routing around entire blocks rather than individual edges. This can significantly improve the efficiency of route-finding algorithms in the presence of road closures or disruptions.
- Alternative Roads: Knowing connectivity can aid choosing backup routes if the shortest path would be blocked for example.