Prefix Sums: Range Queries in Constant Time
Master the prefix sum technique for O(1) range sum queries, subarray sum equals K with hash maps, and extending the pattern to 2D grids.
Terminology
| Term | Definition |
|---|---|
| Prefix sum | An array $P$ where $P[i] = A[0] + A[1] + \cdots + A[i-1]$. Precomputed in $O(n)$, it enables $O(1)$ range sum queries. |
| Range sum query | Computing the sum of elements from index $l$ to $r$: $\text{sum}(l, r) = P[r+1] - P[l]$. |
| Subarray sum equals K | Finding the count (or existence) of contiguous subarrays whose elements sum to a target $K$. Solved in $O(n)$ by combining prefix sums with a hash map. |
What & Why
Many problems ask for the sum of a contiguous subarray. The brute-force approach recomputes the sum from scratch for each query, taking O(n) per query. With q queries, that is O(nq) total.
Prefix sums precompute a cumulative sum array in O(n) time, then answer each range sum query in O(1). The total becomes O(n + q), which is a massive improvement when q is large.
The technique also unlocks a powerful pattern for "subarray sum equals K" problems. By storing prefix sums in a hash map, you can check in O(1) whether any earlier prefix sum differs from the current one by exactly K. This converts an O(n^2) problem into O(n).
The 2D extension handles rectangular region sums in matrices, which appears in image processing, game boards, and geographic data queries.
How It Works
1D Prefix Sum Construction and Query
Build the prefix array once, then answer any range query by subtraction.
Subarray Sum Equals K with Hash Map
Walk through the prefix sums. At each index, check if P[i] - K has been seen before. If so, there exists a subarray ending at i with sum K.
Complexity Analysis
| Operation | Time | Space |
|---|---|---|
| Build 1D prefix sum | $O(n)$ | $O(n)$ |
| 1D range sum query | $O(1)$ | - |
| Subarray sum = K (hash map) | $O(n)$ | $O(n)$ |
| Build 2D prefix sum | $O(mn)$ | $O(mn)$ |
| 2D rectangular region query | $O(1)$ | - |
The key formula for 1D range sum:
For 2D, the inclusion-exclusion formula for the rectangle from $(r_1, c_1)$ to $(r_2, c_2)$:
Implementation
1D Prefix Sum: Range Sum Query
ALGORITHM BuildPrefixSum(A, n)
INPUT: array A of size n
OUTPUT: prefix sum array P of size n + 1
P[0] = 0
FOR i = 0 TO n - 1 DO
P[i + 1] = P[i] + A[i]
END FOR
RETURN P
END
ALGORITHM RangeSum(P, l, r)
INPUT: prefix array P, indices l and r (inclusive)
OUTPUT: sum of A[l..r]
RETURN P[r + 1] - P[l]
END
Subarray Sum Equals K
ALGORITHM CountSubarraySumK(A, n, K)
INPUT: array A of size n, target sum K
OUTPUT: number of contiguous subarrays with sum = K
prefixCount = hash map with {0: 1}
currentSum = 0
count = 0
FOR i = 0 TO n - 1 DO
currentSum = currentSum + A[i]
IF (currentSum - K) EXISTS IN prefixCount THEN
count = count + prefixCount[currentSum - K]
END IF
prefixCount[currentSum] = prefixCount[currentSum] + 1
END FOR
RETURN count
END
2D Prefix Sum
ALGORITHM Build2DPrefixSum(M, rows, cols)
INPUT: matrix M of size rows x cols
OUTPUT: prefix sum matrix P of size (rows+1) x (cols+1)
Initialize P with all zeros
FOR i = 1 TO rows DO
FOR j = 1 TO cols DO
P[i][j] = M[i-1][j-1] + P[i-1][j] + P[i][j-1] - P[i-1][j-1]
END FOR
END FOR
RETURN P
END
ALGORITHM RegionSum(P, r1, c1, r2, c2)
INPUT: 2D prefix matrix P, top-left (r1,c1), bottom-right (r2,c2)
OUTPUT: sum of elements in the rectangle
RETURN P[r2+1][c2+1] - P[r1][c2+1] - P[r2+1][c1] + P[r1][c1]
END
Real-World Applications
- Database aggregation: cumulative sum columns enable instant range queries on time-series data without scanning rows.
- Image processing: integral images (2D prefix sums) power fast box blur, feature detection (Haar cascades in face detection), and region statistics.
- Network monitoring: prefix sums over packet counts enable instant computation of traffic volume over any time interval.
- Spreadsheet formulas: SUM(A1:A100) in spreadsheets is internally optimized using prefix sum caching for recalculation.
- Competitive programming: prefix sums are a building block for difference arrays, Fenwick trees, and segment trees.
Key Takeaways
- Prefix sums trade $O(n)$ preprocessing space for $O(1)$ range sum queries, a classic time-space tradeoff
- The "subarray sum equals K" pattern combines prefix sums with a hash map to find matching subarrays in $O(n)$
- 2D prefix sums use inclusion-exclusion to answer rectangular region queries in $O(1)$ after $O(mn)$ preprocessing
- The off-by-one convention ($P[0] = 0$, array size $n+1$) eliminates boundary special cases
- Prefix sums only work for immutable data. For arrays with updates, use a Fenwick tree or segment tree instead