Binary Search and Its Variants
Mastering the classic binary search algorithm and its powerful variants: lower bound, upper bound, and searching rotated arrays.
Terminology
- Binary search: an algorithm that finds a target value in a sorted array by repeatedly halving the search space. Each comparison eliminates half of the remaining elements.
- Search space: the range of indices (or values) that could still contain the answer. Binary search shrinks this by half on every iteration.
- Loop invariant: a condition that is true before and after every iteration of a loop. For binary search, the invariant is that the target (if present) lies within the current bounds.
- Lower bound: the smallest index $i$ such that $A[i] \geq \text{target}$. Equivalent to "where would this element be inserted to keep the array sorted?"
- Upper bound: the smallest index $i$ such that $A[i] > \text{target}$. Combined with lower bound, this gives the range of all occurrences of a value.
- Rotated sorted array: a sorted array that has been cyclically shifted, so the smallest element is no longer at index 0. Example: [4, 5, 6, 1, 2, 3].
- Off-by-one error: a common bug in binary search where the boundary conditions (inclusive vs exclusive, rounding direction) cause the algorithm to miss the target or loop forever.
- Monotonic predicate: a boolean function over a sorted domain that transitions from false to true (or true to false) exactly once. Binary search can find this transition point.
What & Why
Binary search is one of the most fundamental algorithms in computer science. The idea is simple: if the data is sorted, you can find any element in O(\log n) time by checking the middle element and eliminating half the search space.
Why is this so important? Because \log n grows incredibly slowly. Searching a billion elements takes at most 30 comparisons. This makes binary search the backbone of database indexing, autocomplete systems, and any scenario where you need fast lookups in ordered data.
But the classic "find this exact value" version is just the beginning. The real power of binary search comes from its variants:
- Lower bound: find the first position where a value could be inserted
- Upper bound: find the position just past the last occurrence of a value
- Predicate search: find the transition point of any monotonic true/false function
- Rotated array search: handle sorted arrays that have been cyclically shifted
Getting binary search right is notoriously tricky. Off-by-one errors in the bounds, the midpoint calculation, and the termination condition account for most bugs. This post covers each variant with precise loop invariants.
How It Works
Classic Binary Search
Maintain two pointers, low and high, representing the inclusive search range. Compute the midpoint, compare with the target, and narrow the range.
Lower Bound and Upper Bound
Lower bound finds the leftmost position where the target could be inserted while keeping the array sorted. Upper bound finds the position just past the rightmost occurrence. Together, they define the range [\text{lower}, \text{upper}) containing all copies of the target.
Rotated Array Search
A rotated sorted array has two sorted halves. At each step, determine which half is sorted (compare A[low] with A[mid]), then check if the target falls within that sorted half. This lets you eliminate half the array each iteration, maintaining O(\log n).
Complexity Analysis
| Variant | Time | Space | Prerequisite |
|---|---|---|---|
| Classic binary search | $O(\log n)$ | $O(1)$ | Sorted array |
| Lower bound | $O(\log n)$ | $O(1)$ | Sorted array |
| Upper bound | $O(\log n)$ | $O(1)$ | Sorted array |
| Rotated array search | $O(\log n)$ | $O(1)$ | Rotated sorted array |
Why O(\log n)?
Each iteration cuts the search space in half. Starting with $n$ elements:
The number of halvings until we reach 1 element is:
Each halving does $O(1)$ work (one comparison, one midpoint calculation), so the total is $O(\log n)$.
Comparison with Linear Search
For perspective on how powerful $O(\log n)$ is:
Linear search would need up to a billion comparisons. Binary search needs at most 30.
Implementation
Classic Binary Search
ALGORITHM BinarySearch(A, n, target)
INPUT: sorted array A of size n, target value
OUTPUT: index of target, or -1 if not found
low = 0
high = n - 1
WHILE low <= high DO
mid = low + (high - low) / 2 // avoids integer overflow
IF A[mid] = target THEN
RETURN mid
ELSE IF A[mid] < target THEN
low = mid + 1
ELSE
high = mid - 1
END IF
END WHILE
RETURN -1
END
Lower Bound (First Position >= Target)
ALGORITHM LowerBound(A, n, target)
INPUT: sorted array A of size n, target value
OUTPUT: smallest index i where A[i] >= target (or n if no such index)
low = 0
high = n
WHILE low < high DO
mid = low + (high - low) / 2
IF A[mid] < target THEN
low = mid + 1
ELSE
high = mid
END IF
END WHILE
RETURN low
END
Upper Bound (First Position > Target)
ALGORITHM UpperBound(A, n, target)
INPUT: sorted array A of size n, target value
OUTPUT: smallest index i where A[i] > target (or n if no such index)
low = 0
high = n
WHILE low < high DO
mid = low + (high - low) / 2
IF A[mid] <= target THEN
low = mid + 1
ELSE
high = mid
END IF
END WHILE
RETURN low
END
Search in Rotated Sorted Array
ALGORITHM SearchRotated(A, n, target)
INPUT: rotated sorted array A of size n (no duplicates), target value
OUTPUT: index of target, or -1 if not found
low = 0
high = n - 1
WHILE low <= high DO
mid = low + (high - low) / 2
IF A[mid] = target THEN
RETURN mid
END IF
// Determine which half is sorted
IF A[low] <= A[mid] THEN
// Left half is sorted
IF A[low] <= target AND target < A[mid] THEN
high = mid - 1
ELSE
low = mid + 1
END IF
ELSE
// Right half is sorted
IF A[mid] < target AND target <= A[high] THEN
low = mid + 1
ELSE
high = mid - 1
END IF
END IF
END WHILE
RETURN -1
END
Real-World Applications
- Database indexing: B-tree lookups use binary search within each node to find the correct child pointer or key.
- Git bisect: finds the commit that introduced a bug by binary searching through commit history, treating "bug present" as a monotonic predicate.
- Autocomplete and spell-check: prefix lookups in sorted dictionaries use lower/upper bound to find all words starting with a given prefix.
- System libraries: every standard library provides binary search (C's
bsearch, Python'sbisect, Java'sArrays.binarySearch). - Competitive programming: "binary search the answer" is a common technique where you binary search over possible answer values and check feasibility with a monotonic predicate.
Key Takeaways
- Binary search achieves $O(\log n)$ by halving the search space on every iteration, requiring sorted input
- The midpoint formula
low + (high - low) / 2prevents integer overflow, a subtle but important detail - Lower bound and upper bound are more versatile than exact-match search: they handle duplicates, range queries, and insertion points
- Rotated array search extends binary search by identifying which half is sorted at each step
- The general pattern applies beyond arrays: any monotonic predicate over an ordered domain can be binary searched