Advanced DP Patterns: Knapsack, LCS, Matrix Chain, and Bitmask DP
Four essential dynamic programming patterns that appear everywhere: 0/1 Knapsack, Longest Common Subsequence, Matrix Chain Multiplication, and Bitmask DP.
Terminology
- 0/1 Knapsack: a problem where you select items (each with a weight and value) to maximize total value without exceeding a weight capacity. Each item is either taken or left (no fractions).
- Longest Common Subsequence (LCS): the longest sequence of characters that appears in the same relative order in both input strings, but not necessarily contiguously.
- Subsequence: a sequence derived from another by deleting zero or more elements without changing the order of the remaining elements.
- Matrix Chain Multiplication: the problem of finding the optimal way to parenthesize a chain of matrix multiplications to minimize the total number of scalar multiplications.
- Bitmask DP: a DP technique where the state includes a bitmask (an integer whose bits represent which elements of a set have been "used" or "visited").
- State space: the set of all possible states in a DP formulation. The size of the state space determines the time and space complexity.
- Interval DP: a DP pattern where the state represents a contiguous interval $[i, j]$ and the recurrence tries all possible split points within that interval.
- Subset enumeration: iterating over all subsets of a set, typically using bitmasks from $0$ to $2^n - 1$.
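Subset enumeration has a standard bit-twiddling idiom worth seeing concretely. A minimal Python sketch (function names are illustrative, not from the original post):

```python
def all_subsets(n):
    """Enumerate every subset of an n-element set as a bitmask 0..2^n - 1."""
    return list(range(1 << n))

def submasks_of(mask):
    """Enumerate all non-empty submasks of `mask` in decreasing order,
    using the standard sub = (sub - 1) & mask trick."""
    subs = []
    sub = mask
    while sub:
        subs.append(sub)
        sub = (sub - 1) & mask  # drop to the next smaller submask
    return subs
```

For example, `submasks_of(0b101)` yields `[0b101, 0b100, 0b001]`: only masks whose set bits are a subset of the original's.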
What & Why
The previous post covered DP fundamentals with simple 1D problems. Real-world DP problems are more complex, involving 2D tables, interval ranges, or exponential state spaces compressed via bitmasks. This post covers four patterns that appear repeatedly in interviews, competitive programming, and production systems.
Why these four?
- Knapsack is the template for all "choose or skip" optimization problems: budget allocation, cargo loading, feature selection
- LCS is the foundation of diff algorithms, version control, and bioinformatics sequence alignment
- Matrix Chain Multiplication represents interval DP, which appears in optimal BST construction, polygon triangulation, and parsing
- Bitmask DP handles problems over small sets (up to ~20 elements) where you need to track which elements have been used, like the Traveling Salesman Problem
How It Works
0/1 Knapsack
Given $n$ items with weights $w_1, \ldots, w_n$ and values $v_1, \ldots, v_n$, and a knapsack of capacity $W$, find the subset of items that maximizes total value without exceeding $W$.
State: $\text{dp}[i][c]$ = maximum value using items $1 \ldots i$ with capacity $c$.
Recurrence:

$$\text{dp}[i][c] = \max\big(\text{dp}[i-1][c],\ \text{dp}[i-1][c - w_i] + v_i\big)$$

The first option skips item $i$. The second takes it (only if $c \geq w_i$).
Longest Common Subsequence (LCS)
Given two strings $X$ of length $m$ and $Y$ of length $n$, find the length of their longest common subsequence.
State: $\text{dp}[i][j]$ = LCS length of $X[1 \ldots i]$ and $Y[1 \ldots j]$.
Recurrence:

$$\text{dp}[i][j] = \begin{cases} \text{dp}[i-1][j-1] + 1 & \text{if } X[i] = Y[j] \\ \max\big(\text{dp}[i-1][j],\ \text{dp}[i][j-1]\big) & \text{otherwise} \end{cases}$$
Matrix Chain Multiplication
Given matrices $A_1, A_2, \ldots, A_n$ with dimensions $p_0 \times p_1, p_1 \times p_2, \ldots, p_{n-1} \times p_n$, find the parenthesization that minimizes total scalar multiplications.
State: $\text{dp}[i][j]$ = minimum cost to multiply $A_i \cdot A_{i+1} \cdots A_j$.
Recurrence (try every split point $k$), with base case $\text{dp}[i][i] = 0$:

$$\text{dp}[i][j] = \min_{i \leq k < j} \big(\text{dp}[i][k] + \text{dp}[k+1][j] + p_{i-1}\, p_k\, p_j\big)$$
Bitmask DP (Traveling Salesman)
Given $n$ cities and a distance matrix, find the shortest tour visiting every city exactly once and returning to the start.
State: $\text{dp}[\text{mask}][i]$ = minimum cost to visit the set of cities represented by $\text{mask}$, ending at city $i$.
Recurrence (extend a partial tour ending at city $i$ to an unvisited city $j$):

$$\text{dp}[\text{mask} \cup \{j\}][j] = \min\big(\text{dp}[\text{mask} \cup \{j\}][j],\ \text{dp}[\text{mask}][i] + \text{dist}[i][j]\big) \quad \text{for } i \in \text{mask},\ j \notin \text{mask}$$
The bitmask encodes which cities have been visited. With $n$ cities, there are $2^n$ possible masks, giving $O(2^n \cdot n)$ states.
Complexity Analysis
| Problem | Time | Space | States |
|---|---|---|---|
| 0/1 Knapsack | $O(nW)$ | $O(nW)$ or $O(W)$ | $n \times W$ |
| LCS | $O(mn)$ | $O(mn)$ or $O(\min(m,n))$ | $m \times n$ |
| Matrix Chain | $O(n^3)$ | $O(n^2)$ | $O(n^2)$ intervals |
| TSP (Bitmask) | $O(2^n \cdot n^2)$ | $O(2^n \cdot n)$ | $2^n \times n$ |
Knapsack: Pseudo-Polynomial
The 0/1 Knapsack runs in $O(nW)$, which looks polynomial. But $W$ is a number, not the size of the input. The input size for $W$ is $\log W$ bits. So $O(nW) = O(n \cdot 2^{\log W})$, which is exponential in the input size. This is called pseudo-polynomial time.
TSP: Exponential but Practical
The brute-force TSP solution checks all $n!$ permutations. Bitmask DP reduces this to $O(2^n \cdot n^2)$, which is still exponential but far more practical. For $n = 20$: $20! \approx 2.4 \times 10^{18}$ vs $2^{20} \times 400 \approx 4.2 \times 10^8$.
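The gap is easy to sanity-check directly (a quick Python calculation, with the operation counts as rough proxies for work done):

```python
import math

n = 20
brute_force = math.factorial(n)  # all n! permutations
bitmask_dp = (2 ** n) * n * n    # O(2^n * n^2) DP transitions

# About 2.43e18 vs 4.19e8: roughly nine orders of magnitude apart.
print(f"{brute_force:.2e} vs {bitmask_dp:.2e}")
```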
Implementation
0/1 Knapsack
ALGORITHM Knapsack(weights, values, n, W)
INPUT: arrays weights and values of size n, capacity W
OUTPUT: maximum achievable value
CREATE 2D array dp of size (n + 1) x (W + 1), initialized to 0
FOR i = 1 TO n DO
FOR c = 0 TO W DO
dp[i][c] = dp[i - 1][c] // skip item i
IF weights[i - 1] <= c THEN
take = dp[i - 1][c - weights[i - 1]] + values[i - 1]
dp[i][c] = MAX(dp[i][c], take) // take item i
END IF
END FOR
END FOR
RETURN dp[n][W]
END
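The pseudocode translates directly to any language; a Python sketch that also applies the $O(W)$-space optimization from the complexity table (one 1D row, with capacity iterated downward so each item is counted at most once):

```python
def knapsack(weights, values, W):
    """0/1 knapsack with a 1D table: dp[c] = best value at capacity c."""
    dp = [0] * (W + 1)
    for w, v in zip(weights, values):
        # Iterate capacity downward so dp[c - w] still holds the value
        # from the previous item's row, i.e. no item is taken twice.
        for c in range(W, w - 1, -1):
            dp[c] = max(dp[c], dp[c - w] + v)
    return dp[W]
```

For weights `[1, 3, 4, 5]`, values `[1, 4, 5, 7]`, and capacity 7, the optimum takes the items of weight 3 and 4 for a total value of 9.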
Longest Common Subsequence
ALGORITHM LCS(X, Y, m, n)
INPUT: strings X of length m and Y of length n
OUTPUT: length of longest common subsequence
CREATE 2D array dp of size (m + 1) x (n + 1), initialized to 0
FOR i = 1 TO m DO
FOR j = 1 TO n DO
IF X[i - 1] = Y[j - 1] THEN
dp[i][j] = dp[i - 1][j - 1] + 1
ELSE
dp[i][j] = MAX(dp[i - 1][j], dp[i][j - 1])
END IF
END FOR
END FOR
RETURN dp[m][n]
END
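A direct Python translation of the pseudocode above (a sketch; the 0-indexed string access mirrors the `X[i - 1]` convention):

```python
def lcs_length(X, Y):
    """LCS length via the standard (m+1) x (n+1) DP table."""
    m, n = len(X), len(Y)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if X[i - 1] == Y[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1  # extend the common subsequence
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])  # drop one character
    return dp[m][n]
```

For example, `lcs_length("ABCBDAB", "BDCABA")` is 4 (one LCS is "BCBA").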
Matrix Chain Multiplication
ALGORITHM MatrixChain(p, n)
INPUT: array p of size n + 1 (matrix i has dimensions p[i-1] x p[i])
OUTPUT: minimum number of scalar multiplications
CREATE 2D array dp of size (n + 1) x (n + 1), initialized to 0
// length = chain length (2 means two matrices)
FOR length = 2 TO n DO
FOR i = 1 TO n - length + 1 DO
j = i + length - 1
dp[i][j] = INFINITY
FOR k = i TO j - 1 DO
cost = dp[i][k] + dp[k + 1][j] + p[i - 1] * p[k] * p[j]
dp[i][j] = MIN(dp[i][j], cost)
END FOR
END FOR
END FOR
RETURN dp[1][n]
END
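The same recurrence can also be written top-down; a Python sketch using memoized recursion instead of the bottom-up table (equivalent cost, often easier to get the interval bounds right):

```python
from functools import lru_cache

def matrix_chain(p):
    """Minimum scalar multiplications to compute A_1 * ... * A_n,
    where A_i has dimensions p[i-1] x p[i]."""
    n = len(p) - 1

    @lru_cache(maxsize=None)
    def best(i, j):
        # Minimum cost to multiply the chain A_i .. A_j (1-indexed).
        if i == j:
            return 0
        return min(best(i, k) + best(k + 1, j) + p[i - 1] * p[k] * p[j]
                   for k in range(i, j))

    return best(1, n)
```

For `p = [10, 100, 5, 50]` the answer is 7500: computing `(A1 A2) A3` costs 10·100·5 + 10·5·50 = 7500, versus 75000 for the other parenthesization.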
Bitmask DP (Traveling Salesman)
ALGORITHM TSP(dist, n)
INPUT: n x n distance matrix dist
OUTPUT: minimum cost of a tour visiting all cities starting and ending at city 0
fullMask = (1 << n) - 1
CREATE 2D array dp of size (1 << n) x n, initialized to INFINITY
dp[1][0] = 0 // start at city 0, only city 0 visited
FOR mask = 1 TO fullMask DO
FOR last = 0 TO n - 1 DO
IF dp[mask][last] = INFINITY THEN CONTINUE
IF bit last is not set in mask THEN CONTINUE
FOR next = 0 TO n - 1 DO
IF bit next is set in mask THEN CONTINUE // already visited
newMask = mask OR (1 << next)
newCost = dp[mask][last] + dist[last][next]
dp[newMask][next] = MIN(dp[newMask][next], newCost)
END FOR
END FOR
END FOR
// Find minimum cost to return to city 0
answer = INFINITY
FOR last = 0 TO n - 1 DO
answer = MIN(answer, dp[fullMask][last] + dist[last][0])
END FOR
RETURN answer
END
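A runnable Python version of the same Held-Karp DP (a sketch; only practical for small $n$, as discussed above):

```python
def tsp(dist):
    """Bitmask DP (Held-Karp): shortest tour starting and ending at city 0."""
    n = len(dist)
    INF = float("inf")
    full = (1 << n) - 1
    dp = [[INF] * n for _ in range(1 << n)]
    dp[1][0] = 0  # only city 0 visited, currently at city 0
    for mask in range(1, full + 1):
        for last in range(n):
            cur = dp[mask][last]
            if cur == INF or not (mask >> last) & 1:
                continue
            for nxt in range(n):
                if (mask >> nxt) & 1:
                    continue  # already visited
                new_mask = mask | (1 << nxt)
                cand = cur + dist[last][nxt]
                if cand < dp[new_mask][nxt]:
                    dp[new_mask][nxt] = cand
    # Close the tour by returning to city 0.
    return min(dp[full][last] + dist[last][0] for last in range(n))
```

On the classic 4-city instance below, the optimal tour is 0 → 1 → 3 → 2 → 0 with cost 80.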
Real-World Applications
- Knapsack in resource allocation: cloud providers solve knapsack variants to pack virtual machines onto physical servers, maximizing utilization within memory and CPU constraints.
- LCS in version control: `git diff` and `diff` utilities compute the LCS of two file versions to produce minimal edit scripts.
- Matrix Chain in query optimization: database query planners determine the optimal join order for multi-table queries, which is structurally identical to matrix chain multiplication.
- Bitmask DP in logistics: vehicle routing problems (a generalization of TSP) use bitmask DP for small fleet sizes to find optimal delivery routes.
- LCS in bioinformatics: DNA sequence alignment algorithms are direct extensions of LCS, scoring matches, mismatches, and gaps.
Key Takeaways
- The 0/1 Knapsack pattern ("take or skip") applies to any problem where you choose a subset of items to optimize a value under constraints
- LCS is the foundation of diff algorithms and sequence alignment, solvable in $O(mn)$ with a 2D table
- Matrix Chain Multiplication exemplifies interval DP: the state is a range $[i, j]$ and you try every split point
- Bitmask DP compresses set-based states into integers, making problems over small sets ($n \leq 20$) tractable
- Recognizing which DP pattern a problem fits is the hardest part; the implementation follows naturally from the recurrence