20 Detecting Colorful Paths using DP, Flow Networks

Lecture from: 06.05.2025 | Video: Homelab | Rui Zhangs Notes

Recap: Long Paths and Color Coding

In the previous lecture, we introduced the Long-Path Problem:

Given an undirected graph $G = (V, E)$ and an integer $B$ , does $G$ contain a simple path of length at least $B$ ?

This problem is NP-complete in general. However, for small values of $B$, especially when $B = O(\log n)$, the Color Coding technique offers a randomized algorithm that runs in polynomial time.

Color Coding: The High-Level Idea

To detect a long path of length $B$ , we:

Randomly color the vertices of $G$ with $k = B + 1$ colors.
Search for a colorful path of length $B$ - that is, a path on $k$ vertices with all vertices assigned distinct colors.
Repeat the random coloring multiple times to increase the probability of success.

Previously, we analyzed the success probability of this strategy and showed that a randomized algorithm can solve Long-Path in

$$O((2e)^{k} \cdot k \cdot m)$$

time with high probability. However, we deferred the key question of how to efficiently check whether a given coloring contains a colorful path of length $B$ .
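To make the role of repetition concrete, the success probability of a single coloring can be evaluated numerically. The sketch below assumes the standard counting fact that $k$ fixed vertices receive pairwise distinct colors with probability $k!/k^{k} \geq e^{-k}$ under a uniform random coloring; the function names are illustrative and not taken from the lecture.

```python
import math

def colorful_probability(k: int) -> float:
    """Probability that k fixed vertices get k distinct colors
    when each vertex picks one of k colors uniformly at random."""
    return math.factorial(k) / k**k

def repetitions_for_failure(k: int, delta: float) -> int:
    """Number of independent colorings after which a fixed k-vertex path
    has been missed every time with probability at most delta."""
    p = colorful_probability(k)
    if p >= 1.0:
        return 1  # k = 1: every coloring succeeds
    # (1 - p)^r <= delta  <=>  r >= ln(delta) / ln(1 - p)
    return math.ceil(math.log(delta) / math.log(1.0 - p))
```

Since the per-coloring success probability shrinks roughly like $e^{-k}$, the number of repetitions grows roughly like $e^{k}$, which is where the $e^{k}$ part of the $(2e)^{k}$ factor comes from.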

Detecting a Colorful Path Efficiently

Problem Statement

Given:

A graph $G = (V, E)$ ,
A coloring $γ : V \to [k]$ ,

Determine whether $G$ contains a simple path on $k$ vertices (length $k - 1$ ) such that the vertices along the path have distinct colors - i.e., a colorful path using all $k$ colors.

Suppose someone hands you a graph where each vertex is already colored. Can you find a path of length $k - 1$ where every vertex has a different color? We’ll solve this by carefully tracking all possible ways to build such a path - starting small and growing it one vertex at a time.

Dynamic Programming Approach

DP State

For each vertex $v \in V$ and integer $i \in \{0, 1, \dots, k - 1\}$, define:

$$P_{i}(v) = \{ S \subseteq [k] \mid |S| = i + 1, \text{ and there exists a colorful path of length } i \text{ ending at } v \text{ using exactly the colors in } S \}$$

The set $P_{i} (v)$ collects all color sets of size $i + 1$ that can appear on a colorful path of length $i$ ending at vertex $v$ .
The color $γ (v)$ must be in each such set $S \in P_{i} (v)$ .

$P_{i} (v)$ records all the sets of $i + 1$ colors that appear on colorful paths of length $i$ ending at vertex $v$ .

Example

Suppose a vertex $v$ has color 5. If there exists a path of length 2 ending at $v$, say $(x, y, v)$, where the colors are $\{1, 4, 5\}$, then $\{1, 4, 5\} \in P_{2}(v)$.

Base Case (Path Length 0)

For each vertex $v \in V$ :

$$P_{0}(v) = \{\{γ(v)\}\}$$

This corresponds to a trivial path of length 0 (just the vertex $v$ ) using only its own color.

At the beginning, each vertex can only form a trivial path with its own color.

Recurrence (Path Length $i \geq 1$ )

To compute $P_{i} (v)$ for $i \geq 1$ , we try to extend shorter colorful paths by one edge:

Look at every neighbor $x$ of $v$.
For every color set $R \in P_{i - 1}(x)$:
If $γ(v) \notin R$, we can extend that path by adding $v$.

We add the new color set $R \cup \{γ(v)\}$ to $P_{i}(v)$.

Recurrence Formula:

$$P_{i}(v) = \bigcup_{x \in N(v)} \{ R \cup \{γ(v)\} \mid R \in P_{i - 1}(x),\ γ(v) \notin R \}$$

This just says: if you can reach neighbor $x$ with a colorful path, and $v$ brings a new color, then we can grow the path by attaching $v$ to it.
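A single application of the recurrence can be sketched in Python. The representation (color sets as `frozenset`, the graph as a dict of neighbor lists) and the name `extend_layer` are assumptions made for illustration only.

```python
def extend_layer(adj, color, prev):
    """Compute the layer P_i from the layer P_{i-1}.

    adj:   dict mapping each vertex to an iterable of its neighbors
    color: dict mapping each vertex to its color
    prev:  dict mapping each vertex x to P_{i-1}(x), a set of frozensets
    """
    nxt = {v: set() for v in adj}
    for v in adj:
        cv = color[v]
        for x in adj[v]:              # every neighbor x of v
            for R in prev[x]:         # every color set of a path ending at x
                if cv not in R:       # v contributes a new color
                    nxt[v].add(R | {cv})
    return nxt
```

On the path $a - b - c$ with colors 1, 2, 3, one step from the base case gives $P_{1}(b) = \{\{1, 2\}, \{2, 3\}\}$, one set for each neighbor of $b$.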

Final Check: Did We Find a Colorful Path?

After filling in all $P_{i} (v)$ up to $i = k - 1$ , we scan all vertices $v$ :

If any $P_{k - 1} (v)$ contains a color set of size $k$ , then we found a colorful path of length $k - 1$ .

Formally:

$$\exists v \in V \text{ such that } \exists S \in P_{k - 1}(v) \text{ with } |S| = k \iff \text{a colorful path of length } k - 1 \text{ exists}$$

If even one vertex has a complete set of $k$ colors in its $P_{k - 1}$ , we win!

Final Algorithm

Algorithm BUNT(G, γ, k):
  // Initialization
  for each v in V:
    P_0(v) ← { {γ(v)} }
 
  // Iterative DP for path lengths 1 to k-1
  for i from 1 to k-1:
    for each v in V:
      P_i(v) ← ∅
      for each neighbor x of v:
        for each R in P_{i-1}(x):
          if γ(v) not in R:
            add R ∪ {γ(v)} to P_i(v)
 
  // Check for colorful path
  for each v in V:
    for each S in P_{k-1}(v):
      if |S| == k:
        return YES  // found colorful path of length k-1
  return NO
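
The pseudocode above translates fairly directly into runnable Python. The following is a minimal sketch, not the lecture's reference implementation: it assumes colors are the integers $0, \dots, k - 1$, stores each color set as a $k$-bit mask, and keeps only the current layer in memory. The name `has_colorful_path` is hypothetical.

```python
def has_colorful_path(adj, color, k):
    """Return True iff the colored graph contains a colorful path on k vertices.

    adj:   dict vertex -> list of neighbors (undirected: list both directions)
    color: dict vertex -> int in range(k)
    """
    # Base case: P_0(v) = {{γ(v)}}, encoded as the mask with one bit set.
    layer = {v: {1 << color[v]} for v in adj}

    # Layers 1 .. k-1: extend paths ending at a neighbor x by the vertex v.
    for _ in range(1, k):
        nxt = {v: set() for v in adj}
        for v in adj:
            bit = 1 << color[v]
            for x in adj[v]:
                for mask in layer[x]:
                    if not mask & bit:        # γ(v) is not yet used
                        nxt[v].add(mask | bit)
        layer = nxt

    # Every set in P_{k-1}(v) has exactly k colors, so non-empty suffices.
    return any(layer[v] for v in adj)
```

Bitmasks make the membership test and the union constant-time operations for small $k$, whereas the analysis below charges $O(k)$ per set operation.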

Runtime Analysis

Let’s unpack how much time the dynamic programming algorithm actually takes.

How Much Work Are We Doing?

For each vertex $v$ , and each path length $i$ , the DP state $P_{i} (v)$ stores subsets of $[k]$ of size $i + 1$ .
There are at most $\binom{k}{i + 1}$ such subsets, since we are choosing $i + 1$ colors from $k$.

Think of $P_{i} (v)$ as storing all the colorful combinations of length- $i$ paths ending at $v$ . The number of such color combinations is limited by how many size- $(i + 1)$ subsets of $[k]$ exist.

How Do We Update the DP Table?

For every edge $\{x, v\}$:

We look at each color set $R \in P_{i - 1}(x)$.
If $γ(v) \notin R$, we compute $R \cup \{γ(v)\}$.

Each such operation takes $O (k)$ time - because these sets have at most $k$ elements.

We go through each neighbor and try to extend their colorful paths by one step - checking and merging color sets.

Cost Per Iteration (Fixed $i$ )

At step $i$ , the number of operations is roughly:

$$O\left(m \cdot \binom{k}{i} \cdot k\right)$$

Each edge might lead to $\binom{k}{i}$ sets (the sets in $P_{i - 1}(x)$ have size $i$), and each set takes $O(k)$ time to process.

Total Runtime (All $i$ from 1 to $k - 1$ )

Summing across all iterations:

$$\sum_{i = 1}^{k - 1} O\left(m \cdot \binom{k}{i} \cdot k\right) = O(m \cdot k \cdot 2^{k})$$

This total bound comes from summing over all possible path lengths. The $\sum_{i} \binom{k}{i}$ part is at most $2^{k}$, because it counts the subsets of $[k]$ of sizes 1 through $k - 1$.
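
Spelled out, the binomial sum behind this bound is the full binomial expansion with its two end terms removed:

$$\sum_{i = 1}^{k - 1} \binom{k}{i} = \sum_{i = 0}^{k} \binom{k}{i} - \binom{k}{0} - \binom{k}{k} = 2^{k} - 2 < 2^{k}$$

For instance, with $k = 4$ the sum is $4 + 6 + 4 = 14 = 2^{4} - 2$.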

Derandomization

Our algorithm relies on random colorings - but what if we want to make it deterministic?

We can construct a special family of colorings:

A family $F$ of colorings $γ : V \to [k]$
For every subset $A \subseteq V$ of size $k$ , some coloring in $F$ gives all elements of $A$ different colors.

These are called $k$ -perfect hash families.

We can build such a family with size $O(e^{k} \cdot k^{O(1)} \log n)$.
Run the DP algorithm for each coloring in the family to make the method deterministic.

We will not prove the derandomization here, but it shows that the entire approach can be made deterministic with only a modest increase in runtime.
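
The defining property of such a family can be checked by brute force on tiny instances. The sketch below only tests the property; it does not construct a family of the stated size, and it takes time exponential in $k$. The function name and the encoding of colorings as lists are assumptions for illustration.

```python
from itertools import combinations

def is_k_perfect(n, k, family):
    """Check that every k-subset of range(n) gets k distinct colors
    under at least one coloring in the family.

    family: list of colorings, each a sequence of length n with values in range(k)
    """
    for subset in combinations(range(n), k):
        if not any(len({g[v] for v in subset}) == k for g in family):
            return False  # this subset is never colored with distinct colors
    return True
```

For example, with $n = 3$ and $k = 2$, the two colorings `[0, 1, 0]` and `[0, 0, 1]` together separate every pair of vertices, while the first coloring alone fails on the pair $\{0, 2\}$.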

Summary

So to summarize, we built an efficient dynamic programming method that detects colorful paths in time: $O (m \cdot k \cdot 2^{k})$ . When combined with a random coloring, it gives a Monte Carlo algorithm - fast and correct with high probability.

What Is a Flow Network?

We now turn to a central topic in graph theory and combinatorial optimization: Flows in Networks. This topic is not only mathematically rich but also foundational in applications such as logistics, communication systems, traffic routing, and project scheduling.

A network is a directed graph equipped with a notion of flow capacity and distinguished source and sink nodes. Formally:

Definition: A network is a tuple $N = (V, A, c, s, t)$ where:

$(V, A)$ is a directed graph (digraph), with:
- $V$ : set of nodes (vertices)
- $A \subseteq V \times V$ : set of directed edges (also called arcs)
$c : A \to \mathbb{R}_{0}^{+}$ is the capacity function, assigning a non-negative real number $c(e)$ to each directed edge $e \in A$, representing the maximum flow allowed on that directed edge.
$s \in V$ is the source node, the starting point of flow.
$t \in V$, with $t \neq s$, is the sink node, the destination of flow.

What Is a Flow?

Given a network $N = (V, A, c, s, t)$, a flow is a function $f : A \to \mathbb{R}$ that satisfies the following conditions:

Capacity Constraint (Feasibility): For every directed edge $e \in A$ , the flow must respect the capacity: $0 \leq f (e) \leq c (e)$
Flow Conservation (Kirchhoff’s Law): For every interior node $v \in V \setminus \{s, t\}$, the total incoming flow equals the total outgoing flow:
$$\sum_{u \in V : (u, v) \in A} f(u, v) = \sum_{w \in V : (v, w) \in A} f(v, w)$$

This ensures that no flow is lost or created at any intermediate node - it merely passes through.
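
Both conditions are easy to check mechanically. The following sketch assumes a network stored as a dict from arcs `(u, v)` to capacities and a flow stored the same way; the name `is_flow` and the tolerance parameter `eps` are illustrative choices, not part of the definition.

```python
from collections import defaultdict

def is_flow(capacity, flow, s, t, eps=1e-9):
    """Check the capacity constraint and flow conservation.

    capacity, flow: dicts mapping arcs (u, v) to numbers;
    arcs missing from `flow` are treated as carrying 0.
    """
    # Capacity constraint: 0 <= f(e) <= c(e) on every arc.
    for e, c in capacity.items():
        if not (-eps <= flow.get(e, 0) <= c + eps):
            return False

    # Conservation: inflow minus outflow must vanish at interior nodes.
    balance = defaultdict(float)
    for (u, v) in capacity:
        fe = flow.get((u, v), 0)
        balance[v] += fe
        balance[u] -= fe
    return all(abs(b) <= eps for v, b in balance.items() if v not in (s, t))
```

The tolerance only matters for real-valued capacities; with integer data the comparisons are exact.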

Value of a Flow

The value of a flow quantifies how much flow is sent from the source to the sink.

The value of a flow $f$ , denoted $val (f)$ , is defined as the net outflow from the source:

$$\text{val}(f) := \sum_{u \in V : (s, u) \in A} f(s, u) - \sum_{u \in V : (u, s) \in A} f(u, s)$$

In most cases, especially in standard formulations, there are no incoming edges to the source, so this simplifies to:

$$\text{val}(f) = \sum_{(s, u) \in A} f(s, u)$$

Example Calculation:

Suppose the source $S$ has two outgoing edges carrying flow:

$f(S, A) = 3$
$f(S, B) = 5$

and one incoming edge carrying flow:

$f(C, S) = 1$

Then: $val (f) = (3 + 5) - 1 = 7$
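
The same calculation as a short Python sketch, assuming the flow is stored as a dict from arcs to values (the helper name `flow_value` is illustrative):

```python
def flow_value(flow, s):
    """Net outflow from s: flow on arcs leaving s minus flow on arcs entering s."""
    out_s = sum(x for (u, v), x in flow.items() if u == s)
    in_s = sum(x for (u, v), x in flow.items() if v == s)
    return out_s - in_s

# The three arcs at the source from the example above.
flow = {("S", "A"): 3, ("S", "B"): 5, ("C", "S"): 1}
print(flow_value(flow, "S"))  # 7
```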

Integer Flows

In many practical cases, the capacity function $c$ assigns integer values to edges. This leads to an important special case:

A flow $f$ is called integer-valued (or integral) if:

$$f(e) \in \mathbb{Z} \text{ for all } e \in A$$

Such flows are particularly relevant when dealing with indivisible units, like packages, vehicles, or data packets.

Net Inflow at the Sink

There is a fundamental symmetry in flow networks: what leaves the source must arrive at the sink.

Lemma: Flow Conservation at the Network Level

The net inflow into the sink equals the value of the flow:

$$\text{netinflow}(t) := \sum_{(u, t) \in A} f(u, t) - \sum_{(t, u) \in A} f(t, u) = \text{val}(f)$$

Proof

We start from the flow conservation equation, which holds for every interior node $v \in V \setminus \{s, t\}$:

$$\sum_{(u, v) \in A} f(u, v) - \sum_{(v, w) \in A} f(v, w) = 0$$

Summing over all such $v$:

$$\sum_{v \in V \setminus \{s, t\}} \left( \sum_{(u, v) \in A} f(u, v) - \sum_{(v, w) \in A} f(v, w) \right) = 0$$

Now observe:

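If we summed over all nodes of $V$ instead, every flow value $f(x, y)$ would appear once with a + sign (as inflow at $y$) and once with a − sign (as outflow at $x$), so that full sum is zero.
The sum over the interior nodes omits only the terms for $s$ and $t$, so those two terms must add up to zero as well.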

Rewriting:

$$0 = \underbrace{\sum_{(u, s) \in A} f(u, s) - \sum_{(s, u) \in A} f(s, u)}_{= -\text{val}(f)} + \underbrace{\sum_{(u, t) \in A} f(u, t) - \sum_{(t, u) \in A} f(t, u)}_{= \text{netinflow}(t)}$$

Hence:

$$\text{netinflow}(t) = \text{val}(f)$$

This tells us that in a stable network, the total amount of flow sent from the source equals what is received at the sink.
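
The lemma can be checked numerically on a small hand-made flow. The network below is an assumed example, not one from the lecture; it includes an arc back into the source to show that the net quantities still agree.

```python
def net_inflow(flow, v):
    """Flow on arcs entering v minus flow on arcs leaving v."""
    inflow = sum(x for (a, b), x in flow.items() if b == v)
    outflow = sum(x for (a, b), x in flow.items() if a == v)
    return inflow - outflow

# Conservation holds at the interior nodes a and b:
#   a: in 2, out 1 + 1      b: in 2 + 1, out 2 + 1
flow = {
    ("s", "a"): 2, ("s", "b"): 2, ("a", "b"): 1,
    ("a", "t"): 1, ("b", "t"): 2, ("b", "s"): 1,
}

value = -net_inflow(flow, "s")        # val(f) = net outflow from s
print(value, net_inflow(flow, "t"))   # 3 3
```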

Looking Ahead: The Maximum Flow Problem

This sets the stage for the central algorithmic question in flow networks:

Find a flow $f$ in $N = (V, A, c, s, t)$ that maximizes $val (f)$ .

We will also uncover deep theoretical results like the Max-Flow Min-Cut Theorem, which links the maximum amount of flow that can be pushed from source to sink with the structure of the network itself.

Imagine drawing a curve through the graph that separates $s$ and $t$ . Such “cuts” are dual to flows and will guide our understanding of the network’s capacity limitations.

We’ll look at this and other topics in the coming lectures…

Continue here: 21 Flow Networks, Cuts, Max-Flow Min-Cut, Residual Networks, Ford-Fulkerson
