Dynamic Programming Fundamentals
Understanding overlapping subproblems, optimal substructure, memoization, and tabulation: the core ideas behind dynamic programming.
Terminology
- Dynamic programming (DP): an algorithmic technique that solves complex problems by breaking them into overlapping subproblems, solving each subproblem once, and storing the result for reuse.
- Overlapping subproblems: when a recursive solution to a problem solves the same subproblem multiple times. DP eliminates this redundancy.
- Optimal substructure: a problem has optimal substructure if its optimal solution can be constructed from optimal solutions of its subproblems.
- Memoization (top-down): a technique where you write the natural recursive solution but cache results of subproblems in a table. Subproblems are solved on demand.
- Tabulation (bottom-up): a technique where you fill in a table iteratively, starting from the smallest subproblems and building up to the final answer. No recursion needed.
- State: the set of parameters that uniquely identifies a subproblem. For Fibonacci, the state is a single integer $n$. For more complex problems, the state may be a tuple of multiple values.
- Transition (recurrence): the formula that expresses the solution to a subproblem in terms of smaller subproblems. This is the core equation you derive for any DP problem.
- Base case: the smallest subproblem(s) whose answer is known directly, without further recursion.
What & Why
Dynamic programming is one of the most powerful algorithmic techniques, and also one of the most misunderstood. At its core, DP is just "recursion without redundancy." If a problem can be broken into subproblems that overlap (the same subproblem appears multiple times in the recursion tree), DP stores each result so it is computed only once.
Two conditions must hold for DP to apply:
- Overlapping subproblems: the naive recursive solution recomputes the same subproblems many times
- Optimal substructure: the optimal solution to the whole problem can be built from optimal solutions to subproblems
When both conditions hold, DP transforms an exponential brute-force solution into a polynomial one. The classic example is Fibonacci: naive recursion is O(2^n), but DP reduces it to O(n).
There are two implementation strategies:
- Memoization (top-down): write the recursive solution, add a cache. Solves only the subproblems that are actually needed.
- Tabulation (bottom-up): fill a table iteratively from base cases up. Avoids recursion overhead and is often more cache-friendly.
Both produce the same asymptotic complexity. The choice between them is usually about coding style and constant factors.
How It Works
The Fibonacci Example
The Fibonacci sequence is defined by $F(n) = F(n-1) + F(n-2)$ with base cases $F(0) = 0$ and $F(1) = 1$. The naive recursive call tree for $F(5)$ shows massive redundancy:
$F(3)$ is computed twice. $F(2)$ is computed three times. For larger $n$, this redundancy grows exponentially. DP eliminates it by storing each $F(i)$ the first time it is computed.
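To make the redundancy concrete, here is a small Python sketch (the call counter is an illustrative addition, not part of the algorithm) that tallies how often the naive recursion recomputes each subproblem:

```python
from collections import Counter

calls = Counter()  # how many times each F(i) is computed

def fib_naive(n):
    calls[n] += 1
    if n <= 1:
        return n
    return fib_naive(n - 1) + fib_naive(n - 2)

fib_naive(5)
# calls[3] == 2 and calls[2] == 3, matching the redundancy described above
```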
Memoization (Top-Down)
With memoization, you keep the recursive structure but add a lookup table. Before computing F(n), check if the answer is already cached. If so, return it immediately. If not, compute it, store it, and return it.
The recursion tree effectively becomes a DAG (directed acyclic graph) where each node is visited at most once.
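In Python, for instance, `functools.lru_cache` adds this cache with one decorator; the miss counter below (an illustrative addition) confirms that each of the $n + 1$ distinct subproblems is computed exactly once:

```python
from functools import lru_cache

misses = 0  # counts cache misses, i.e. subproblems actually computed

@lru_cache(maxsize=None)
def fib(n):
    global misses
    misses += 1
    if n <= 1:
        return n
    return fib(n - 1) + fib(n - 2)

fib(30)
# misses == 31: one computation per distinct subproblem F(0)..F(30)
```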
Tabulation (Bottom-Up)
With tabulation, you eliminate recursion entirely. Create an array dp of size n+1, set the base cases dp[0] = 0 and dp[1] = 1, then fill forward:
For $i = 2$ to $n$: $\text{dp}[i] = \text{dp}[i-1] + \text{dp}[i-2]$
This is a simple loop with no recursion overhead and excellent cache locality.
The DP Problem-Solving Framework
For any DP problem, follow these steps:
- Define the state: what parameters uniquely identify a subproblem?
- Write the recurrence: how does the answer for a state relate to smaller states?
- Identify base cases: what are the trivial subproblems with known answers?
- Determine the computation order: which states depend on which? (for tabulation)
- Optimize space if possible: often you only need the last row or last two values
Complexity Analysis
| Approach | Time | Space | Notes |
|---|---|---|---|
| Naive recursion (Fibonacci) | $O(2^n)$ | $O(n)$ | Exponential due to redundant calls |
| Memoized (Fibonacci) | $O(n)$ | $O(n)$ | Each subproblem solved once |
| Tabulated (Fibonacci) | $O(n)$ | $O(n)$ | Iterative, no recursion overhead |
| Space-optimized (Fibonacci) | $O(n)$ | $O(1)$ | Only store last two values |
General DP Complexity
For a DP problem with $S$ possible states and $T$ time per transition, the total running time is $O(S \times T)$.
For Fibonacci: $S = n + 1$ states, $T = O(1)$ per transition, so $O(n)$ total.
For more complex problems like edit distance on strings of length $m$ and $n$: $S = m \times n$ states, $T = O(1)$ per transition, so $O(mn)$ total.
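To illustrate the $S \times T$ accounting, here is a minimal Levenshtein edit-distance sketch (function name is my own): the table has $(m+1)(n+1)$ cells and each is filled in $O(1)$ time, giving $O(mn)$ overall:

```python
def edit_distance(a: str, b: str) -> int:
    m, n = len(a), len(b)
    # dp[i][j] = minimum edits to turn a[:i] into b[:j]
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i  # delete all i characters
    for j in range(n + 1):
        dp[0][j] = j  # insert all j characters
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                dp[i][j] = dp[i - 1][j - 1]  # characters match: no edit
            else:
                dp[i][j] = 1 + min(dp[i - 1][j],      # delete
                                   dp[i][j - 1],      # insert
                                   dp[i - 1][j - 1])  # substitute
    return dp[m][n]

print(edit_distance("kitten", "sitting"))  # 3
```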
Why DP Works: Counting Subproblems
The naive Fibonacci recursion makes $O(2^n)$ calls because the recursion tree branches into two calls at nearly every node and has depth $n$. But there are only $n + 1$ distinct subproblems ($F(0)$ through $F(n)$). DP exploits this gap between the number of recursive calls and the number of unique subproblems.
Implementation
Fibonacci: Memoization (Top-Down)
ALGORITHM FibMemo(n, memo)
  INPUT: non-negative integer n, memo table (initially all EMPTY)
  OUTPUT: the nth Fibonacci number
  IF n <= 1 THEN RETURN n
  IF memo[n] != EMPTY THEN
    RETURN memo[n]
  END IF
  memo[n] = FibMemo(n - 1, memo) + FibMemo(n - 2, memo)
  RETURN memo[n]
END
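A possible Python rendering of this pseudocode, using a dict as the memo table (names are illustrative):

```python
def fib_memo(n, memo=None):
    """Top-down Fibonacci: recurse naturally, but cache each result."""
    if memo is None:
        memo = {}          # fresh memo table per top-level call
    if n <= 1:
        return n           # base cases F(0)=0, F(1)=1
    if n in memo:
        return memo[n]     # already solved: return cached answer
    memo[n] = fib_memo(n - 1, memo) + fib_memo(n - 2, memo)
    return memo[n]
```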
Fibonacci: Tabulation (Bottom-Up)
ALGORITHM FibTab(n)
  INPUT: non-negative integer n
  OUTPUT: the nth Fibonacci number
  IF n <= 1 THEN RETURN n
  CREATE array dp of size n + 1
  dp[0] = 0
  dp[1] = 1
  FOR i = 2 TO n DO
    dp[i] = dp[i - 1] + dp[i - 2]
  END FOR
  RETURN dp[n]
END
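The same tabulation in Python, as a sketch:

```python
def fib_tab(n):
    """Bottom-up Fibonacci: fill dp[0..n] from the base cases upward."""
    if n <= 1:
        return n
    dp = [0] * (n + 1)  # dp[i] will hold F(i)
    dp[1] = 1
    for i in range(2, n + 1):
        dp[i] = dp[i - 1] + dp[i - 2]
    return dp[n]
```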
Fibonacci: Space-Optimized
ALGORITHM FibOptimized(n)
  INPUT: non-negative integer n
  OUTPUT: the nth Fibonacci number
  IF n <= 1 THEN RETURN n
  prev2 = 0
  prev1 = 1
  FOR i = 2 TO n DO
    current = prev1 + prev2
    prev2 = prev1
    prev1 = current
  END FOR
  RETURN prev1
END
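In Python, the rolling-variable version might look like this (tuple assignment replaces the explicit `current` temporary):

```python
def fib_optimized(n):
    """O(1)-space Fibonacci: keep only the last two values."""
    if n <= 1:
        return n
    prev2, prev1 = 0, 1  # F(i-2) and F(i-1)
    for _ in range(2, n + 1):
        prev2, prev1 = prev1, prev2 + prev1
    return prev1
```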
Minimum Cost Climbing Stairs
A classic introductory DP problem: given an array cost where cost[i] is the cost of stepping on stair $i$, find the minimum cost to reach the top. You can start from stair 0 or 1, and at each stair you can climb 1 or 2 steps.
ALGORITHM MinCostClimbing(cost, n)
  INPUT: array cost of size n
  OUTPUT: minimum cost to reach the top (one step past stair n-1)
  IF n <= 1 THEN RETURN 0
  prev2 = 0
  prev1 = 0
  FOR i = 2 TO n DO
    current = MIN(prev1 + cost[i - 1], prev2 + cost[i - 2])
    prev2 = prev1
    prev1 = current
  END FOR
  RETURN prev1
END
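A Python sketch of the same idea, where `prev1` and `prev2` hold the cheapest cost of reaching stairs $i-1$ and $i-2$:

```python
def min_cost_climbing(cost):
    """Minimum total cost to step past the last stair.

    From stair i you pay cost[i] and climb 1 or 2 steps; you may
    start for free at stair 0 or stair 1.
    """
    prev2 = prev1 = 0  # cost of reaching stairs 0 and 1 is zero
    for i in range(2, len(cost) + 1):
        prev2, prev1 = prev1, min(prev1 + cost[i - 1],
                                  prev2 + cost[i - 2])
    return prev1

print(min_cost_climbing([10, 15, 20]))  # 15
```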
Real-World Applications
- Sequence alignment: bioinformatics uses DP (Needleman-Wunsch, Smith-Waterman) to align DNA and protein sequences, finding the best match between two strings.
- Shortest paths: the Bellman-Ford algorithm and Floyd-Warshall algorithm are both DP solutions to shortest-path problems in graphs.
- Text editors: the diff algorithm that powers git diff uses DP to compute the longest common subsequence between two file versions.
- Speech recognition: hidden Markov models use the Viterbi algorithm (a DP algorithm) to find the most likely sequence of words given audio input.
- Resource allocation: knapsack-style DP solves budget allocation, cargo loading, and investment portfolio optimization problems.
Key Takeaways
- DP applies when a problem has overlapping subproblems and optimal substructure
- Memoization (top-down) adds caching to a recursive solution; tabulation (bottom-up) fills a table iteratively. Both achieve the same complexity.
- The key to solving any DP problem is defining the state, writing the recurrence, and identifying base cases
- DP transforms exponential brute-force solutions into polynomial ones by ensuring each subproblem is solved exactly once
- Space optimization is often possible: if the recurrence only depends on the previous row or the last few values, you can reduce space from $O(n)$ to $O(1)$