Lecture from: 24.10.2024 | Video: Videos ETHZ | Official Script
Dynamic Programming
Dynamic programming is a powerful technique that empowers even mere mortal programmers to tackle complex problems. It builds upon the idea of using invariants, similar to what we explored in sorting algorithms.
Instead of directly solving a problem, dynamic programming breaks it down into smaller overlapping subproblems and recursively solves them. The key difference lies in storing the results of these subproblems to avoid redundant calculations, leading to significant efficiency gains.
Recursion: A Primer
Let’s illustrate this with the classic Fibonacci sequence example:
Pseudocode:
This recursive approach defines fibonacci(n)
in terms of two smaller Fibonacci values. However, repeated calculations for the same subproblems lead to exponential time complexity.
Runtime Analysis of Fibonacci
Let’s analyze the runtime complexity of our recursive Fibonacci implementation:
- Base Cases: where ‘c’ is some constant (representing the time for the base cases).
- Recursive Step: where ‘d’ is another constant (accounting for additional operations within the recursive step).
Notice how this recurrence relation resembles the Fibonacci sequence calculation itself! This similarity is crucial to understanding the runtime.
Using Induction: We can use induction to show that: .
Where represents the nth Fibonacci number. (See exercise 3.4 for more info on how we deduced this…)
Therefore, the runtime complexity of this recursive Fibonacci implementation is exponential, specifically .
This begs the question: what can we do to combat this inefficiency?
Memoization (Solution 1)
Memoization provides a powerful technique to optimize our recursive Fibonacci implementation.
Instead of recalculating Fibonacci numbers repeatedly, we store the results of previously computed values in a data structure like an array or dictionary. This avoids redundant calculations, significantly improving performance.
Let’s illustrate this with Python and a memo
array:
Pseudocode:
Explanation:
- We initialize an empty list
memo
to store calculated Fibonacci numbers. - Before each recursive call, we check if the result for
n
is already inmemo
. If it is, we directly return the stored value. Otherwise, we calculate the Fibonacci number recursively and store it inmemo
before returning.
Runtime:
Memoization dramatically reduces the runtime complexity from exponential to linear . This is because each Fibonacci number is calculated only once and subsequently retrieved from the memo array, resulting in a single pass through the problem space.
Iterative Solution (Solution 2)
While memoization optimizes recursion, an iterative approach can be even more efficient and often easier to understand. This “bottom-up” solution builds the Fibonacci sequence incrementally from smaller values up to the desired n
.
Pseudocode:
Runtime:
The iterative solution has a runtime complexity of . We perform a fixed number of iterations (from 2 to n
), making the time proportional to the input size.
The Core Idea
Dynamic programming (DP) hinges on a powerful concept: breaking down complex problems into smaller, overlapping subproblems. These subproblems can often be solved recursively, but DP’s real magic lies in how we handle these solutions.
We aim to solve each subproblem only once and store its result. This eliminates redundant calculations, leading to significant efficiency gains. DP offers various strategies for tackling these subproblems:
- Top-Down: Start with the main problem and recursively break it down into smaller subproblems. Techniques like memoization (storing results in a table) are often used in top-down approaches.
- Bottom-Up: Solve the smallest subproblems first and progressively build up to the larger ones. This often involves iteratively computing solutions for increasing input sizes.
Max Subarray Problem Revisited
Let’s revisit the classic “Max Subarray Sum” problem from previous lectures (03 Max Subarray Sum): Given an array of numbers, find the contiguous subarray with the largest sum.
Subproblem Definition: The key insight is that to find the maximum sum subarray ending at index j
, we can use the result of the maximum sum subarray ending at index j-1
. This forms our recursive relationship:
By clearly defining the subproblem and establishing a recursive relationship, we pave the way for efficient dynamic programming solutions.
Jump Game
Problem: Given an array of positive (non-zero) numbers A[1..n]
representing the maximum jump length from each position, determine the minimum number of jumps required to reach the last element, A[n]
, starting from position 1.
Input: An array of positive integers A[1..n]
where n
represents the total number of elements in the array.
Game Rules:
- Start at position 1 of the array.
- From any position
i
, you can jump to any positionj
wherei + 1 ≤ j ≤ i + A[i]
.
Objective: Find the minimum number of jumps needed to reach the last element, A[n]
.
Let’s solve this.
Try 1: Breaking Down the Problem
Define a subproblem:
Consider an example array A = [1, 3, 5, 3, 2, 1, 2, 4, 4, 2, 9]
. How do we solve for S[8]
?
To reach index 8, consider the positions reachable in one jump:
- Jump to position 8 from position 3 (since
A[3] = 5
and3 + 5 >= 8
). - Jump to position 8 from position 7 (since
A[7] = 4
and7 + 4 >= 8
).
Thus, the solution for S[8]
is:
This generalizes to our recursive relationship for any index i
:
In other words, find the minimum jump count from reachable positions one step earlier.
Bottom-Up Implementation
Let’s implement a bottom-up solution to calculate the minimum jumps required.
Pseudocode:
Explanation:
- Initialization: Create a list
S
of sizen+1
filled with infinity (float('inf')
), representing the minimum jumps needed to reach each index, initially assuming all indices are unreachable. SetS[1]
to 0 because it takes zero jumps to reach the starting position. - Iteration: For each index
i
from 2 ton
, consider all possible jumping positionsj
from 1 toi-1
. If a jump from positionj
is valid (i.e.,j + A[j - 1] >= i
), updateS[i]
to be the minimum of its current value and1 + S[j]
. This represents taking one jump from positionj
and adding the minimum jumps needed to reachj
. - Return: Finally, return
S[n]
, which contains the minimum jumps required to reach the last indexn
.
Runtime Complexity
This bottom-up approach has a runtime complexity of .
- Outer Loop: The outer loop iterates
n
times (for each indexi
). - Inner Loop: For each
i
, the inner loop potentially iterates up toi
times (considering all possible jumping positionsj
).
Mathematically, we can express this as: The nested loop structure results in a quadratic time complexity.
Try 2: Switching Perspectives - Geht es besser?
Yes, we can potentially solve this problem more efficiently. Instead of focusing on the position S[i]
(minimum jumps to reach index i
), let’s consider the maximum reachable index given a certain number of jumps:
Thus, we redefine the problem in terms of reachable positions for each jump count ().
Note for readers: Since I probably wasn’t the only one slightly confused with what’s going on, I suggest watching this video explaining the efficient approach…
This change in perspective leads to a more efficient solution. Let’s illustrate with an example using A = [1, 3, 5, 3, 2, 1, 2, 4, 4, 2, 9]
.
Example: To find M[3]
(the maximum index reachable with 3 jumps), build upon M[2]
:
- With
M[2]
, we can reach indices 0 through 4. - For each
i
within that range (M[2] <= i
), consider the positioni + A[i]
reachable with one more jump (fromM[2]
).
Recursive Formulation:
We start with . To compute for recursively, we consider all positions reachable with at most jumps—i.e., positions . From each such position , we can jump up to , so the maximum index reachable with jumps is:
This formulation builds the reachable indices iteratively by considering jumps from the previously reachable positions.
Pseudocode:
Optimization of the Recurrence
Looking closer at the recurrence formula for , we see that to compute , it suffices to consider only the indices that require exactly jumps to reach, rather than all reachable with jumps. These indices are the ones that satisfy:
This gives an improved recurrence relation:
with the additional base case .
Pseudocode:
This optimized recurrence allows us to reduce the runtime of the algorithm to , as each index contributes to only one maximum calculation across the entire process.
Lower Bound
The time complexity of is optimal for this problem. To demonstrate this, we can construct a worst-case input where any algorithm must examine each element at least once.
Consider the array . Here, we must read all entries except the last to correctly compute the solution since each entry directly affects the reachability of subsequent indices. Any algorithm must read all entries, thus guaranteeing a worst-case time complexity of for this problem.
Longest Common Subsequence
Leetcode: Longest Common Subsequence
Problem: Given two strings, A and B, find the length of their longest common non-contiguous subsequence. Input: Two strings, A and B, consisting of lowercase English letters. Output: The length of the longest common non-contiguous subsequence of A and B.
Example
Input:
A = "rabbbit"
B = "rabbit"
Output:
5 // "rabbi" is a common non-contiguous subsequence of length 5.
Input:
A = "tiger"
B = "ziege"
Output:
3 // "ige" is a common non-contiguous subsequence of length 3.
Understanding Non-Contiguous Subsequences
- A subsequence is a sequence formed by deleting some (or no) characters from another sequence without changing the order of the remaining characters. For example, “ten” and “tin” are subsequences of “kitten”.
- A non-contiguous subsequence allows gaps between chosen characters. “rabbi” is a non-contiguous subsequence of “rabbbit”.
This problem type has wide applications in computer science, including:
- Bioinformatics: Analyzing similarities and differences between DNA sequences.
- Text Processing: Finding common patterns or motifs within large text corpora.
Attempt 1: Subproblem A[1..i]
and B[1..i]
While thinking about subproblems (A[1..i]
, B[1..i]
) is a good start, simply carrying over the LCS from the previous subproblem won’t always work. We need to consider all possible substrings and their contributions to the overall LCS.
The Flaw:
Imagine A = “tiger” and B = “ziege”. At i = 4
, the LCS of A[1..4]
(“tige”) and B[1..4]
(“zieg”) is 2 (we find “ie”). However, at i = 5
, the correct LCS becomes 3 (“ige”). We can’t directly derive this from the previous subproblem’s result because we need to account for the potential inclusion of ‘e’ from B.
The Challenge:
We’d have to analyze every possible substring combination up to i = 5
to find the true LCS. This becomes computationally expensive and inefficient.
Attempt 2: Subproblem Definition and Recursion – A Deeper Dive
Note to reader: Here too, I suggest watching this great video…
Let’s shift our focus towards a more structured approach using subproblems and recursion.
We aim to define a subproblem that represents the LCS (Least Common Subsequence) of two subsequences, A[1..i]
and B[1..j]
, denoted as L[i, j]
. Our goal is to build a recursive relationship that allows us to calculate L[i, j]
based on smaller subproblems.
Breaking Down the Problem:
We can express this subproblem recursively based on three key cases:
-
Base Case (Empty Subsequences): If either sequence is empty (
i = 0
orj = 0
), there are no common elements, so the LCS is 0. -
Matching Characters: If the last characters of both subsequences (
A[i]
andB[j]
) match, we can include this character in our LCS. The remaining LCS is then calculated by considering the subsequences without their last elements: -
Non-Matching Characters: If
A[i]
andB[j]
don’t match, we consider two possibilities:
- Include the character
a_i
fromA
and calculate the LCS ofA[1..i-1]
andB[1..j]
: - Include the character
b_j
fromB
and calculate the LCS ofA[1..i]
andB[1..j-1]
:
- We choose the option that yields a longer LCS:
Combining the Cases:
Finally, we combine these three cases into a single recursive formula:
Bottom-Up Approach
The bottom-up approach offers an efficient way to compute the longest common subsequence (LCS) by systematically filling a 2D table, reflecting the subproblem solutions as we go. Think of this table as a grid where each cell represents the LCS length for specific prefixes of our input strings A
and B
.
Pseudocode:
Explanation:
- Table Initialization: We create a
L
table of size(n+1) x (m+1)
, filled with zeros initially. Each cellL[i][j]
will store the length of the LCS for substringsA[0..i-1]
andB[0..j-1]
. The extra row and column handle empty substring cases. - Table Filling: We iterate through each character position
i
in stringA
andj
in stringB
.- If
A[i-1]
matchesB[j-1]
, the LCS length increases by 1 compared to the diagonal cell (L[i-1][j-1]
). - Otherwise, we take the maximum LCS length from the cell above (
L[i-1][j]
) or to the left (L[i][j-1]
).
- If
- Result Extraction: The final result, the length of the LCS for
A
andB
, is stored inL[n][m]
.
Note that after filling the table, we backtrack from L[n][m]
(the bottom-right corner) to reconstruct the actual LCS.
- If
A[i-1]
equalsB[j-1]
, it’s part of the LCS, so we add it to the beginning of thelcs
string and move diagonally up-left (i-=1
,j-=1
). - Otherwise, we move either up (
i -= 1
) or left (j -= 1
) based on which cell has a larger value in the table.
Time Complexity
The time complexity of this bottom-up approach is , which translates to when both input strings, A
and B
, have a length of approximately N
. While this is efficient for many cases, the question remained: geht es besser?
It turns out that under certain assumptions, a slight improvement to is possible, although the implementation is quite complex. The real question was whether more substantial improvements were achievable – could we find an algorithm with a time complexity of for some ? This remained open for decades.
Finally, in 2015, two mathematicians/computer scientists provided a definitive answer: no, such a substantial improvement is impossible.
Edit Distance
This problem tackles the classic concept of edit distance, which quantifies the minimum number of operations required to transform one string into another. Think of it as the “distance” between two strings in terms of insertions, deletions, or substitutions.
Given two strings A
and B
, return the minimum number of operations needed to convert A
into B
. The allowed operations are:
- Insertion: Add a character to
A
. - Deletion: Remove a character from
A
. - Substitution: Replace a character in
A
with a different character.
Input:
- Two strings,
A
andB
, consisting of lowercase English letters.
Output:
- An integer representing the minimum edit distance between
A
andB
.
Example:
Input: word1 = "horse", word2 = "ros"
Output: 3
Explanation:
- One way to transform "horse" to "ros" is:
- Delete "h": "orse"
- Delete "e": "ors"
- Substitute 's' for 'r': "ros"
Let’s break down how to solve the Edit Distance problem using a recursive approach.
Understanding the Problem
The goal is to find the minimum number of edits (insertions, deletions, or substitutions) needed to transform one string (word1
) into another (word2
).
Note to the reader: Once again, I suggest watching this video…
Defining the Recursive Relation
Let ED(i, j)
represent the edit distance between the first i
characters of word1
and the first j
characters of word2
. We can express this recursively:
ED(i, j) = min(
1 + ED(i - 1, j), // Deletion case: Remove A[i]
1 + ED(i, j - 1), // Insertion case: Insert B[j]
ED(i - 1, j - 1) + (A[i] != B[j]) // Substitution case: Replace A[i] if needed
)
Explanation:
- Deletion Case (
1 + ED(i - 1, j)
): If the last character ofword1
(A[i]
) is not present inword2
, we need to delete it. The edit distance becomes 1 plus the distance between the remaining substrings (ED(i - 1, j)
). - Insertion Case (
1 + ED(i, j - 1)
): If the last character ofword2
(B[j]
) is not present inword1
, we need to insert it. The edit distance becomes 1 plus the distance betweenword1
and the remaining substring ofword2
(ED(i, j - 1)
). - Substitution Case (
ED(i - 1, j - 1) + (A[i] != B[j])
): If the last characters of both strings match (A[i] == B[j]
), no operation is needed. Otherwise, we need a substitution, adding 1 to the distance between the remaining substrings (ED(i - 1, j - 1)
).
Base Cases:
ED(0, 0) = 0
: The edit distance between empty strings is 0.ED(i, 0) = i
: To transform ani
-character string into an empty one, we needi
deletions.ED(0, j) = j
: To transform an empty string into aj
-character string, we needj
insertions.
Pseudocode:
Runtime
The runtime complexity of this recursive solution is , where n and m are the lengths of the input strings. This is because we visit each cell in the dp
array exactly once.