Lecture from: 25.02.2025 | Video: Homelab
Review: Cycles and Circuits
Last time, we discussed:
- Eulerian Tours (Eulerian Circuits): Closed walks that traverse every edge of a graph exactly once.
- Hamiltonian Cycles: Cycles that visit every vertex of a graph exactly once (and return to the starting vertex).
We established a necessary and sufficient condition for the existence of an Eulerian tour: a connected graph has an Eulerian tour if and only if every vertex has an even degree. We also briefly touched on Hierholzer’s algorithm for constructing an Eulerian tour.
For Hamiltonian cycles, we examined examples like the Icosahedral graph, grid graphs, and n-dimensional hypercubes (in the context of Gray codes). However, we did not discuss an efficient algorithm for finding Hamiltonian cycles in general graphs, because, in general, no such efficient algorithm is known.
Hamiltonian Cycles
While there’s no known efficient algorithm to find Hamiltonian cycles in all graphs, there are some sufficient conditions that guarantee their existence. One important result is Dirac’s Theorem:
Dirac’s Theorem (1952): If $G$ is a graph with $n \geq 3$ vertices, and the minimum degree of $G$, denoted by $\delta(G)$, satisfies $\delta(G) \geq n/2$, then $G$ has a Hamiltonian cycle.
Important Note: Dirac’s Theorem provides a sufficient condition, but it is not a necessary condition. A graph may have a Hamiltonian cycle even if it does not satisfy the condition $\delta(G) \geq n/2$.
Example: Hypercubes Revisited
Let’s consider an n-dimensional hypercube, $Q_n$.
- Number of vertices: $2^n$ (each vertex corresponds to an n-bit binary string).
- Degree of each vertex: $n$ (each vertex is connected to $n$ other vertices, differing in exactly one bit position).
For Dirac’s condition to hold, we would need $n \geq 2^n/2 = 2^{n-1}$. This inequality fails for all $n \geq 3$.
For example, for $n = 3$ (a cube), we have $\delta(Q_3) = 3 < 4 = 2^3/2$. However, we know that a 3-dimensional hypercube (a cube) does have a Hamiltonian cycle (we used this for Gray codes). This demonstrates that Dirac’s condition is sufficient but not necessary.
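To make this concrete, here is a small Python sketch (our own illustration, not from the lecture) that builds the reflected Gray code and verifies that it traces a Hamiltonian cycle on $Q_n$, even though Dirac’s condition fails for $n \geq 3$:

    def gray_code(n):
        """Reflected Gray code: an ordering of all n-bit strings in which
        consecutive strings differ in exactly one bit."""
        if n == 0:
            return [""]
        prev = gray_code(n - 1)
        return ["0" + s for s in prev] + ["1" + s for s in reversed(prev)]

    def is_hamiltonian_cycle_on_hypercube(order):
        """Check that consecutive codewords (cyclically) differ in exactly
        one bit, i.e., that the ordering is a Hamiltonian cycle of Q_n."""
        m = len(order)
        return all(sum(a != b for a, b in zip(order[i], order[(i + 1) % m])) == 1
                   for i in range(m))

    codes = gray_code(3)  # Q_3: degree 3 < 4 = 2^3 / 2, so Dirac does not apply
    print(codes)                                     # ['000', '001', '011', ...]
    print(is_hamiltonian_cycle_on_hypercube(codes))  # True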
The Principle of Inclusion-Exclusion (Siebformel)
The Principle of Inclusion-Exclusion (PIE), sometimes referred to as “Siebformel” in German, is a counting technique used to determine the cardinality of the union of multiple sets.
- For two sets:
  $|A \cup B| = |A| + |B| - |A \cap B|$
- For three sets:
  $|A \cup B \cup C| = |A| + |B| + |C| - |A \cap B| - |A \cap C| - |B \cap C| + |A \cap B \cap C|$
- General Formula: For sets $A_1, A_2, \dots, A_n$:
  $\left| \bigcup_{i=1}^{n} A_i \right| = \sum_{\emptyset \neq I \subseteq \{1, \dots, n\}} (-1)^{|I|+1} \left| \bigcap_{i \in I} A_i \right|$
Intuition: The PIE corrects for overcounting. We first add the sizes of all individual sets. Then we subtract the sizes of all pairwise intersections (because these elements were counted twice). Then we add back the sizes of all three-way intersections (because these elements were subtracted too many times), and so on.
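As a quick sanity check, the following snippet (our own illustration) compares a direct count of $|A \cup B \cup C|$ with the three-set inclusion-exclusion formula:

    A, B, C = {1, 2, 3, 4}, {3, 4, 5}, {4, 5, 6, 7}

    direct = len(A | B | C)
    pie = (len(A) + len(B) + len(C)
           - len(A & B) - len(A & C) - len(B & C)
           + len(A & B & C))

    print(direct, pie)  # 7 7: both count |A ∪ B ∪ C|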
Proof of Dirac’s Theorem
Dirac’s Theorem: If $G = (V, E)$ is a simple graph with $n$ vertices ($n \geq 3$) and the minimum degree of $G$, $\delta(G)$, satisfies $\delta(G) \geq n/2$, then $G$ has a Hamiltonian cycle.
The proof has two main parts:
- Show that G is connected.
- Show that G has a Hamiltonian cycle.
Part 1: The graph is connected
We need to prove that for any two distinct vertices $u, v \in V$, there exists a path between $u$ and $v$.
- Case 1: $\{u, v\} \in E$
If $u$ and $v$ are directly connected by an edge, then a path of length 1 exists between them.
- Case 2: $\{u, v\} \notin E$
If $u$ and $v$ are not directly connected, we still need to demonstrate the existence of a path.
Let $N(u)$ be the set of neighbors of $u$ (i.e., the vertices adjacent to $u$), and $N(v)$ be the set of neighbors of $v$. Because the minimum degree is at least $n/2$, we know that $|N(u)| \geq n/2$ and $|N(v)| \geq n/2$.
The core idea is to show that $N(u)$ and $N(v)$ must intersect. If they intersect, there exists a vertex $w$ that is a neighbor of both $u$ and $v$, and we have a path $u, w, v$ of length 2.
We present two arguments to show this intersection:
- Pigeonhole Principle Argument:
Assume, for the sake of contradiction, that $N(u) \cap N(v) = \emptyset$ (the neighborhoods are disjoint). Since $G$ is simple and $\{u, v\} \notin E$, neither $u$ nor $v$ lies in $N(u) \cup N(v)$, so the sets $\{u\}$, $\{v\}$, $N(u)$, and $N(v)$ are all mutually disjoint. Therefore, the total number of vertices in the union of these sets is
$1 + 1 + |N(u)| + |N(v)| \geq 2 + \frac{n}{2} + \frac{n}{2} = n + 2.$
This is a contradiction, because the total number of vertices in the graph is $n$, and we’ve found at least $n + 2$ distinct vertices. Therefore, our assumption that $N(u) \cap N(v) = \emptyset$ must be false, meaning $N(u)$ and $N(v)$ must have at least one vertex in common.
- Principle of Inclusion-Exclusion Argument:
The Principle of Inclusion-Exclusion states:
$|N(u) \cap N(v)| = |N(u)| + |N(v)| - |N(u) \cup N(v)|.$
We know that $u$ and $v$ are not in $N(u) \cup N(v)$ (because $u$ is not a neighbor of itself, and similarly for $v$, and we assumed $u$ and $v$ are not neighbors). Therefore, $|N(u) \cup N(v)| \leq n - 2$. Substituting the known values, we get:
$|N(u) \cap N(v)| \geq \frac{n}{2} + \frac{n}{2} - (n - 2) = 2.$
This shows that the neighborhoods of $u$ and $v$ must have at least two vertices in common.
By either argument, if $u$ and $v$ are not directly connected, there must exist a path of length 2 between them. Combined with Case 1, any two vertices are joined by a path of length at most 2, so the graph is connected.
Part 2: Existence of a Hamiltonian Cycle
We will prove Dirac’s Theorem, which provides a sufficient condition for the existence of a Hamiltonian cycle. We use a proof by extremality (considering a maximal structure).
Lemma: If a graph $G$ satisfies Dirac’s condition ($\delta(G) \geq n/2$) and $P$ is a path on $k$ vertices, $2 \leq k \leq n$, then either:
- $P$ can be extended to a path on $k + 1$ vertices, or
- $G$ contains a cycle of length at least $k$ that includes all the vertices of $P$.
Proof of Lemma (Outline):
The proof proceeds by induction on $k$.
- Base Cases: For small values, say $k = 2$ and $k = 3$, the claim is easy to verify directly.
- Inductive Step: Let $P = v_1 v_2 \dots v_k$ be a path on $k$ distinct vertices. We consider two cases (a concrete sketch of Case 2 follows this list):
- Case 1: Path Extension: If either $v_1$ or $v_k$ has a neighbor outside the path $P$, we can extend the path to include that neighbor, creating a path on $k + 1$ vertices.
- Case 2: Cycle Formation: If neither $v_1$ nor $v_k$ has a neighbor outside $P$, then all their neighbors must be within $P$. We can show, using Dirac’s condition and the Pigeonhole Principle, that there must exist an index $i$ such that $v_1$ is adjacent to $v_{i+1}$ and $v_i$ is adjacent to $v_k$. This allows us to form a cycle of length $k$: $v_1, v_{i+1}, v_{i+2}, \dots, v_k, v_i, v_{i-1}, \dots, v_2, v_1$. Such an index exists because $v_1$ and $v_k$ together have at least $n/2 + n/2 = n \geq k$ neighbors, all located among the $k - 1$ candidate positions $i \in \{1, \dots, k-1\}$, so by pigeonhole some position serves both (i.e., $\{v_1, v_{i+1}\} \in E$ and $\{v_i, v_k\} \in E$).
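The Case 2 argument is constructive. The following sketch (our own illustration; `rotate_to_cycle` is a hypothetical helper, with edges represented as frozensets) finds an index $i$ with $\{v_1, v_{i+1}\} \in E$ and $\{v_i, v_k\} \in E$ and assembles the cycle:

    def rotate_to_cycle(path, E):
        """Given a path v_1..v_k (as a list of vertices) whose endpoints have
        no neighbors outside the path, find an index i with {v_1, v_{i+1}} in E
        and {v_i, v_k} in E, and return the cycle through all vertices of the
        path. Dirac's condition guarantees that such an i exists."""
        k = len(path)
        for i in range(1, k):  # path[i] plays the role of v_{i+1}
            if (frozenset({path[0], path[i]}) in E
                    and frozenset({path[i - 1], path[-1]}) in E):
                # Cycle: v_1, v_{i+1}, ..., v_k, v_i, v_{i-1}, ..., v_2, (v_1)
                return [path[0]] + path[i:] + path[1:i][::-1]
        return None  # unreachable when deg(v_1) + deg(v_k) >= k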
Proof of Dirac’s Theorem (using the Lemma):
- Connectedness: See Part 1 of the proof above.
- Longest Path: Let $P = v_1 v_2 \dots v_k$ be a longest path in $G$.
- All Vertices Included: The longest path must contain all $n$ vertices. Otherwise, since $P$ cannot be extended (it is longest), the Lemma yields a cycle through all the vertices of $P$; because $G$ is connected, some vertex outside this cycle would be adjacent to a vertex on it, which would produce a path longer than $P$, a contradiction.
- Applying the Lemma: With $k = n$, the path $P$ contains all $n$ vertices and cannot be extended, so the Lemma yields a cycle through all $n$ vertices, i.e., a Hamiltonian cycle.
Conclusion: Dirac’s Theorem guarantees the existence of a Hamiltonian cycle in a graph if the minimum degree is sufficiently high (at least half the number of vertices). The proof relies on constructing a longest path and showing that it can be closed into a cycle due to the minimum degree condition.
Finding Hamiltonian Cycles Algorithmically: Towards Exponential Time
As we’ve established, determining whether a graph has a Hamiltonian cycle is a computationally difficult problem (NP-complete). There’s no known polynomial-time algorithm to solve it.
Brute-Force Approach (and its inefficiency):
A naive, brute-force approach would be:
- Generate all permutations: Generate all possible orderings (permutations) of the vertices of the graph.
- Check each permutation: For each permutation, check if it forms a valid Hamiltonian cycle:
- Verify that there’s an edge between consecutive vertices in the permutation.
- Verify that there’s an edge between the last and first vertices.
Time Complexity of Brute-Force:
- There are $n!$ (n factorial) permutations of $n$ vertices.
- Checking each permutation takes $O(n)$ time (checking for edges between consecutive vertices).
- Therefore, the total time complexity of the brute-force approach is $O(n! \cdot n)$, which is highly inefficient (super-exponential). A direct implementation is sketched below.
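A direct implementation of this brute-force idea might look as follows (our own sketch; it fixes vertex 1 as the starting point, which avoids re-checking rotations of the same cycle):

    from itertools import permutations

    def hamiltonian_cycle_bruteforce(n, edges):
        """Decide the Hamiltonian cycle problem by trying all orderings of
        vertices 2..n after the fixed start vertex 1; O(n! * n) time."""
        E = {frozenset(e) for e in edges}
        for perm in permutations(range(2, n + 1)):
            tour = (1,) + perm
            # Edges between consecutive vertices, plus the closing edge to 1.
            if (all(frozenset({tour[i], tour[i + 1]}) in E for i in range(n - 1))
                    and frozenset({tour[-1], 1}) in E):
                return True
        return False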
Goal: Exponential Time Algorithm ($O(2^n \cdot \mathrm{poly}(n))$):
Our objective is to develop an algorithm that, while still exponential, is significantly better than the factorial time complexity of the brute-force approach. We aim for an algorithm with a time complexity of $O(2^n)$, or more precisely $O(2^n \cdot p(n))$, where $p(n)$ is a polynomial function of $n$. This is a substantial improvement, as exponential time is much smaller than factorial time for large values of $n$. Concretely, we will get this down to $O(2^n \cdot n^2)$.
Hamiltonian Cycles with Dynamic Programming ($O(2^n \cdot n^2)$)
We can find Hamiltonian cycles using dynamic programming, achieving a time complexity of $O(2^n \cdot n^2)$, a significant improvement over the brute-force approach.
The Core Idea (Dynamic Programming):
Instead of trying all permutations of vertices, we’ll build up solutions for subsets of vertices. We’ll store information about paths that start at a specific vertex (we’ll choose vertex 1 as our starting point) and visit all vertices in a given subset.
Definitions:
- Let $G = (V, E)$ be the graph, with $V = \{1, 2, \dots, n\}$. We assume vertex 1 is the starting vertex for our potential Hamiltonian cycle.
- For any subset $S \subseteq V$ such that $1 \in S$, and any vertex $x \in S$ where $x \neq 1$, we define:
- $P_{S,x} = 1$ if there exists a path that starts at vertex 1, visits exactly the vertices in $S$ (each exactly once), and ends at vertex $x$.
- $P_{S,x} = 0$ otherwise.
Dynamic Programming Table:
$P_{S,x}$ represents the entries in our dynamic programming table. The table has the following characteristics:
- Rows: Represent subsets $S$ of $V$ that include vertex 1.
- Columns: Represent vertices $x$ in the subset (excluding 1).
Initialization:
For all $x \in V \setminus \{1\}$, we initialize the base cases:
$P_{\{1,x\},x} = 1$ if $\{1, x\} \in E$, and $P_{\{1,x\},x} = 0$ otherwise.
This means that there’s a path from vertex 1 to vertex $x$ using only the vertices $\{1, x\}$ if and only if there’s an edge directly between them.
Recursive Relation (Building the Table):
We build the table iteratively, considering subsets of increasing size. For each subset $S$ with $1 \in S$ and $|S| = s$ (where $s$ ranges from 3 to $n$), and for each $x \in S$ ($x \neq 1$):
$P_{S,x} = \max_{y \in S \setminus \{1, x\},\ \{y, x\} \in E} P_{S \setminus \{x\},\, y}$
Explanation of the Recursion:
- To determine if there’s a path from 1 to $x$ using all vertices in $S$ (i.e., whether $P_{S,x} = 1$), we look at all possible previous vertices $y$ in the subset (excluding 1 and $x$).
- We check if there was a path from 1 to $y$ using all vertices in $S \setminus \{x\}$ (i.e., $P_{S \setminus \{x\},\,y} = 1$). This means we had a valid subpath that ended at $y$.
- We also check if there is an edge between $y$ and $x$ ($\{y, x\} \in E$).
- If both conditions are true, then we can extend the path from 1 to $y$ to include $x$, thus creating a path from 1 to $x$ using all vertices in $S$.
- We take the max because we only need one such $y$ to exist for $P_{S,x}$ to be 1.
Pseudocode:
from itertools import combinations

def hamiltonian_cycle(n, edges):
    # Vertices are numbered 1..n; `edges` is a collection of pairs (u, v).
    E = {frozenset(e) for e in edges}
    P = {}  # P[(S, x)] = 1 iff a path starts at 1, visits exactly S, ends at x

    # Initialization (base cases): subsets of the form S = {1, x}
    for x in range(2, n + 1):
        P[(frozenset({1, x}), x)] = 1 if frozenset({1, x}) in E else 0

    # Build the table iteratively over subsets of increasing size s
    for s in range(3, n + 1):
        # Generate all subsets of size s containing vertex 1
        for rest in combinations(range(2, n + 1), s - 1):
            S = frozenset(rest) | {1}
            for x in rest:
                # Maximum over predecessors y with {y, x} an edge
                P[(S, x)] = max(
                    (P[(S - {x}, y)]
                     for y in rest if y != x and frozenset({y, x}) in E),
                    default=0,
                )

    # Check for a Hamiltonian cycle: a Hamiltonian path 1 -> x plus edge {x, 1}
    V = frozenset(range(1, n + 1))
    return any(frozenset({x, 1}) in E and P[(V, x)] == 1
               for x in range(2, n + 1))
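A small usage example under the conventions assumed in the code above (vertices numbered 1..n, edges given as pairs): the 4-cycle has a Hamiltonian cycle, while the path on 4 vertices does not.

    cycle_edges = [(1, 2), (2, 3), (3, 4), (4, 1)]
    path_edges = [(1, 2), (2, 3), (3, 4)]

    print(hamiltonian_cycle(4, cycle_edges))  # True:  1-2-3-4-1
    print(hamiltonian_cycle(4, path_edges))   # False: closing edge {4, 1} missing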
Final Result:
After filling the table, the graph contains a Hamiltonian cycle if and only if there exists a vertex $x \neq 1$ such that $P_{V,x} = 1$ and $\{x, 1\} \in E$. This is because the cycle would need to include all vertices, and the edge $\{x, 1\}$ closes the Hamiltonian path into a cycle.
Time Complexity Analysis:
- Number of Subsets: There are $2^n$ possible subsets of $V$. Since we only consider subsets containing vertex 1, the number of relevant subsets is $2^{n-1}$.
- Iterations: For each subset $S$, we iterate through at most $n$ possible values of $x$, and for each $x$, we iterate through at most $n$ possible values of $y$.
- Overall: The time complexity is $O(2^{n-1} \cdot n \cdot n) = O(2^n \cdot n^2)$.
Space Complexity:
- The table stores the values $P_{S,x}$ for each subset $S$ containing vertex 1 and each vertex $x$. Therefore, the space complexity is $O(2^n \cdot n)$.
This dynamic programming algorithm provides a significant improvement over the brute-force method, bringing the complexity down from factorial to exponential time.
Dynamic Programming: How to Brand Your Research?
Not important, but funny regardless…
The slide presents a quote from Richard Bellman, a pioneer in dynamic programming, explaining how he came up with the name “dynamic programming”:
The 1950s were not good years for mathematical research. [The Secretary of Defense] had a pathological fear and hatred of the word “research”.
What title, what name, could I choose? […] I wanted to get across the idea that this was dynamic, this was multistage, this was time-varying.
[…] It also has a very interesting property as an adjective, and that is it’s impossible to use the word dynamic in a pejorative sense.
— Richard Bellman, Eye of the Hurricane: An Autobiography (1984)
Key Takeaways from the Quote:
- Political Climate: In the 1950s, during the Cold War, there was significant government scrutiny and suspicion surrounding research, particularly in mathematics and theoretical fields.
- Strategic Naming: Bellman chose the term “dynamic programming” very deliberately. It was not just a descriptive name; it was a strategic choice to avoid negative connotations associated with “research.”
- Meaningful Adjectives: He wanted a name that conveyed the key aspects of the technique:
- Dynamic: Highlighting the changing, evolving nature of the problems being solved.
- Multistage: Emphasizing the step-by-step, sequential decision-making process.
- Time-Varying: Reflecting that the problems often involved processes that change over time.
- Positive Connotation: Bellman also noted the inherent positivity of the word “dynamic,” making it difficult to use in a negative or critical way. This was a clever way to “sell” his work in a challenging political environment.
In essence, the name “dynamic programming” was as much about marketing and perception as it was about the mathematical technique itself. It was a way to make the research sound practical, useful, and non-threatening to those who held the purse strings.
Eulerian Tours vs. Hamiltonian Cycles: Complexity and Algorithms
Let’s contrast the Eulerian tour and Hamiltonian cycle problems in terms of their complexity and the algorithms used to solve them.
Summary of Results
Eulerian Tour:
- Theorem: A connected undirected graph has an Eulerian tour if and only if the degree of every vertex is even.
- Algorithm: Hierholzer’s algorithm finds an Eulerian tour (if one exists) in $O(|E|)$ time. This is a very efficient, linear-time algorithm; a sketch follows below.
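A compact sketch of Hierholzer’s idea (our own illustration, assuming an adjacency-list input, that every degree is even, and that the graph is connected): walk along unused edges until stuck, then backtrack, which implicitly splices sub-tours into the main tour.

    def hierholzer(adj, start):
        """Eulerian tour starting (and ending) at `start`. `adj` maps each
        vertex to a list of its neighbors (each undirected edge appears in
        both lists) and is consumed during the walk. Note: list.remove makes
        this sketch O(|E| * max_degree); a true O(|E|) version would index
        the edges."""
        tour, stack = [], [start]
        while stack:
            v = stack[-1]
            if adj[v]:
                u = adj[v].pop()   # traverse the unused edge {v, u} ...
                adj[u].remove(v)   # ... and delete its reverse copy
                stack.append(u)
            else:
                tour.append(stack.pop())  # dead end: emit vertex while unwinding
        return tour[::-1]

    # Example: a triangle has the Eulerian tour 1 -> 3 -> 2 -> 1.
    print(hierholzer({1: [2, 3], 2: [1, 3], 3: [1, 2]}, 1))  # [1, 3, 2, 1]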
Hamiltonian Cycle:
- Theorem (Informal): Determining whether a graph has a Hamiltonian cycle is a computationally difficult problem (NP-complete).
- Algorithm (Dynamic Programming): We can solve the Hamiltonian cycle problem using dynamic programming in $O(2^n \cdot n^2)$ time. This is an exponential-time algorithm, which is much slower than the linear-time algorithm for Eulerian tours but significantly better than a brute-force approach ($O(n! \cdot n)$).
The Hamiltonian Cycle Problem: Complexity and Solvability
- Decision Problem: Given a graph $G = (V, E)$, does $G$ contain a Hamiltonian cycle?
- We can decide this question, and find a Hamiltonian cycle if one exists, in $O(2^n \cdot n^2)$ time.
- Theorem (Karp 1972): The Hamiltonian cycle problem (“Given a graph $G$, does $G$ contain a Hamiltonian cycle?”) is NP-complete.
Complexity Theory: A Brief Excursion (Preview of Theoretical Computer Science)
The concept of NP-completeness is central to understanding the difference in difficulty between the Eulerian tour and Hamiltonian cycle problems. Let’s briefly introduce some key ideas from complexity theory. (This is a simplified overview; a full treatment is typically covered in a theoretical computer science course.)
Classes of Problems: P and NP
- P (Polynomial Time): The class of decision problems (problems with a “yes” or “no” answer) that can be solved by a deterministic algorithm in polynomial time (i.e., the running time is bounded by a polynomial function of the input size, like $n$, $n^2$, $n^3$, etc.). Problems in P are considered “efficiently solvable.” The Eulerian tour problem falls into the class P.
- NP (Nondeterministic Polynomial Time): The class of decision problems for which a “yes” answer can be verified in polynomial time, given a suitable “certificate” or “proof.” More formally, a problem is in NP if there exists a nondeterministic algorithm that can solve it in polynomial time. (A nondeterministic algorithm can be thought of as making “lucky guesses” that always lead to a solution if one exists.)
The P versus NP Problem
- The Big Question: One of the most important unsolved problems in computer science and mathematics is whether P = NP. In other words, can every problem whose solution can be verified in polynomial time also be solved in polynomial time?
- The Million-Dollar Question: The Clay Mathematics Institute offers a $1 million prize for a correct proof of either P = NP or P ≠ NP. This is one of the seven Millennium Prize Problems.
- General Belief: Most computer scientists believe that P ≠ NP, but a formal proof remains elusive.
NP-Completeness
- Definition: A problem $X$ in NP is NP-complete if every other problem in NP can be reduced to $X$ in polynomial time. This means that if you could find a polynomial-time algorithm for an NP-complete problem, you would automatically have a polynomial-time algorithm for every problem in NP, and thus prove that P = NP.
- Significance: NP-complete problems are considered the “hardest” problems in NP. If you can show that a problem is NP-complete, it’s strong evidence that it’s unlikely to have an efficient (polynomial-time) algorithm. The Hamiltonian cycle problem is NP-complete.
Examples of NP-Complete Problems
There are thousands of known NP-complete problems, arising in diverse fields. Some examples include:
- Hamiltonian Cycle: Finding a cycle that visits every vertex exactly once.
- Traveling Salesperson Problem (TSP): Finding the shortest route that visits a set of cities and returns to the starting city.
- Knapsack Problem: Given a set of items with weights and values, find the most valuable subset of items that can fit within a given weight capacity.
- Clique Problem: Finding the largest complete subgraph (clique) in a given graph.
- Satisfiability (SAT): Determining if there’s an assignment of truth values to variables that makes a Boolean formula true.
- Many, many others: Including problems in scheduling, logistics, graph theory, cryptography, game playing, and more.
The ubiquity of NP-complete problems underscores their importance in computer science.
The Traveling Salesman Problem (TSP)
The Traveling Salesman Problem (TSP) is a classic optimization problem that is closely related to the Hamiltonian cycle problem and is also NP-complete.
Problem Definition:
- Given: A complete, weighted graph $K_n$ on a set $V = \{1, \dots, n\}$ of cities, where the weight of edge $\{i, j\}$, denoted by $w(i, j)$, represents the distance (or cost, or travel time) between cities $i$ and $j$.
- Goal: Find a shortest tour (cycle) that visits each city exactly once and returns to the starting city. This tour is a Hamiltonian cycle, and we want to minimize the total length of the tour. (A dynamic-programming sketch in the spirit of the table $P$ above follows this list.)
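The dynamic program for Hamiltonian cycles adapts almost directly to TSP by replacing the boolean table with minimum path costs (this is the classical Bellman-Held-Karp idea; the sketch below is ours and assumes symmetric weights given as a dict):

    from itertools import combinations

    def tsp_held_karp(n, w):
        """Minimum tour length over cities 1..n, where w[(i, j)] = w[(j, i)]
        is the distance between i and j. Runs in O(2^n * n^2) time;
        C[(S, x)] is the length of the shortest path that starts at 1,
        visits exactly the vertices in S, and ends at x (the weighted
        analogue of the table P above)."""
        C = {}
        for x in range(2, n + 1):
            C[(frozenset({1, x}), x)] = w[(1, x)]
        for s in range(3, n + 1):
            for rest in combinations(range(2, n + 1), s - 1):
                S = frozenset(rest) | {1}
                for x in rest:
                    C[(S, x)] = min(C[(S - {x}, y)] + w[(y, x)]
                                    for y in rest if y != x)
        V = frozenset(range(1, n + 1))
        return min(C[(V, x)] + w[(x, 1)] for x in range(2, n + 1))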
TSP is NP-Complete
Theorem: The Traveling Salesman Problem (TSP) is NP-complete.
A consequence of the NP-completeness of TSP and the specific reduction used above is that it’s also hard to approximate the optimal solution to TSP in general.
Theorem: For any constant $C$, there is no polynomial-time $C$-approximation algorithm for the general TSP unless P = NP.
Explanation:
- Approximation Algorithm: A $C$-approximation algorithm for a minimization problem is an algorithm that finds a solution whose cost is at most $C$ times the optimal cost.
- Inapproximability: The theorem states that we cannot find a polynomial-time algorithm that always finds a tour whose length is within a constant factor of the optimal tour length, unless P = NP.
- Proof Idea: Suppose there were a constant $C > 0$ and a polynomial-time $C$-approximation algorithm for TSP. Apply the reduction from the Hamiltonian cycle problem mentioned above and run this algorithm on the resulting TSP instance. If the returned tour has length 0, then the original graph contains a Hamiltonian cycle; otherwise it does not. This would solve an NP-complete problem in polynomial time and would therefore imply P = NP. (A sketch of such a reduction follows below.)
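The reduction itself is not reproduced in these notes; one standard variant consistent with the “return value 0” check (our sketch, an assumption about the exact weights used in the lecture) gives every edge of $G$ weight 0 and every non-edge weight 1, so the optimal tour has length 0 exactly when $G$ has a Hamiltonian cycle, and a $C$-approximation must then also return $C \cdot 0 = 0$:

    def hc_to_tsp_instance(n, edges):
        """Build a complete TSP instance from a graph on vertices 1..n:
        weight 0 for edges of G, weight 1 for non-edges. The optimal tour
        has length 0 iff G has a Hamiltonian cycle, so any C-approximation
        returns 0 exactly on the yes-instances."""
        E = {frozenset(e) for e in edges}
        return {(i, j): 0 if frozenset({i, j}) in E else 1
                for i in range(1, n + 1) for j in range(1, n + 1) if i != j}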
Practical Implications:
This inapproximability result means that we need to rely on heuristics, approximation algorithms with weaker guarantees (e.g., for special cases like the Metric TSP), or exponential-time algorithms (like dynamic programming) to solve TSP instances in practice. There are no silver bullets.