Back to Blog

Binary Search and Its Variants

Mastering the classic binary search algorithm and its powerful variants: lower bound, upper bound, and searching rotated arrays.

2021-04-03
Share
Algorithmsbinary-searchsearching

Terminology

  • Binary search: an algorithm that finds a target value in a sorted array by repeatedly halving the search space. Each comparison eliminates half of the remaining elements.
  • Search space: the range of indices (or values) that could still contain the answer. Binary search shrinks this by half on every iteration.
  • Loop invariant: a condition that is true before and after every iteration of a loop. For binary search, the invariant is that the target (if present) lies within the current bounds.
  • Lower bound: the smallest index $i$ such that $A[i] \geq \text{target}$. Equivalent to "where would this element be inserted to keep the array sorted?"
  • Upper bound: the smallest index $i$ such that $A[i] > \text{target}$. Combined with lower bound, this gives the range of all occurrences of a value.
  • Rotated sorted array: a sorted array that has been cyclically shifted, so the smallest element is no longer at index 0. Example: [4, 5, 6, 1, 2, 3].
  • Off-by-one error: a common bug in binary search where the boundary conditions (inclusive vs exclusive, rounding direction) cause the algorithm to miss the target or loop forever.
  • Monotonic predicate: a boolean function over a sorted domain that transitions from false to true (or true to false) exactly once. Binary search can find this transition point.

What & Why

Binary search is one of the most fundamental algorithms in computer science. The idea is simple: if the data is sorted, you can find any element in O(\log n) time by checking the middle element and eliminating half the search space.

Why is this so important? Because \log n grows incredibly slowly. Searching a billion elements takes at most 30 comparisons. This makes binary search the backbone of database indexing, autocomplete systems, and any scenario where you need fast lookups in ordered data.

But the classic "find this exact value" version is just the beginning. The real power of binary search comes from its variants:

  • Lower bound: find the first position where a value could be inserted
  • Upper bound: find the position just past the last occurrence of a value
  • Predicate search: find the transition point of any monotonic true/false function
  • Rotated array search: handle sorted arrays that have been cyclically shifted

Getting binary search right is notoriously tricky. Off-by-one errors in the bounds, the midpoint calculation, and the termination condition account for most bugs. This post covers each variant with precise loop invariants.

How It Works

Classic Binary Search

Maintain two pointers, low and high, representing the inclusive search range. Compute the midpoint, compare with the target, and narrow the range.

Binary Search for target = 23 Step 1: 2 5 8 12 16 23 38 42 56 low mid high 23 > 16, go right Step 2: 2 5 8 12 16 23 38 42 56 low mid high 23 < 38, go left Step 3: 23 found 23 = 23, found at index 5

Lower Bound and Upper Bound

Lower bound finds the leftmost position where the target could be inserted while keeping the array sorted. Upper bound finds the position just past the rightmost occurrence. Together, they define the range [\text{lower}, \text{upper}) containing all copies of the target.

Rotated Array Search

A rotated sorted array has two sorted halves. At each step, determine which half is sorted (compare A[low] with A[mid]), then check if the target falls within that sorted half. This lets you eliminate half the array each iteration, maintaining O(\log n).

Complexity Analysis

Variant Time Space Prerequisite
Classic binary search $O(\log n)$ $O(1)$ Sorted array
Lower bound $O(\log n)$ $O(1)$ Sorted array
Upper bound $O(\log n)$ $O(1)$ Sorted array
Rotated array search $O(\log n)$ $O(1)$ Rotated sorted array

Why O(\log n)?

Each iteration cuts the search space in half. Starting with $n$ elements:

$n \to \frac{n}{2} \to \frac{n}{4} \to \cdots \to 1$

The number of halvings until we reach 1 element is:

$\log_2 n$

Each halving does $O(1)$ work (one comparison, one midpoint calculation), so the total is $O(\log n)$.

Comparison with Linear Search

For perspective on how powerful $O(\log n)$ is:

$n = 10^9 \implies \log_2 n \approx 30$

Linear search would need up to a billion comparisons. Binary search needs at most 30.

Implementation

Classic Binary Search

ALGORITHM BinarySearch(A, n, target)
  INPUT: sorted array A of size n, target value
  OUTPUT: index of target, or -1 if not found

  low = 0
  high = n - 1

  WHILE low <= high DO
    mid = low + (high - low) / 2    // avoids integer overflow

    IF A[mid] = target THEN
      RETURN mid
    ELSE IF A[mid] < target THEN
      low = mid + 1
    ELSE
      high = mid - 1
    END IF
  END WHILE

  RETURN -1
END

Lower Bound (First Position >= Target)

ALGORITHM LowerBound(A, n, target)
  INPUT: sorted array A of size n, target value
  OUTPUT: smallest index i where A[i] >= target (or n if no such index)

  low = 0
  high = n

  WHILE low < high DO
    mid = low + (high - low) / 2

    IF A[mid] < target THEN
      low = mid + 1
    ELSE
      high = mid
    END IF
  END WHILE

  RETURN low
END

Upper Bound (First Position > Target)

ALGORITHM UpperBound(A, n, target)
  INPUT: sorted array A of size n, target value
  OUTPUT: smallest index i where A[i] > target (or n if no such index)

  low = 0
  high = n

  WHILE low < high DO
    mid = low + (high - low) / 2

    IF A[mid] <= target THEN
      low = mid + 1
    ELSE
      high = mid
    END IF
  END WHILE

  RETURN low
END

Search in Rotated Sorted Array

ALGORITHM SearchRotated(A, n, target)
  INPUT: rotated sorted array A of size n (no duplicates), target value
  OUTPUT: index of target, or -1 if not found

  low = 0
  high = n - 1

  WHILE low <= high DO
    mid = low + (high - low) / 2

    IF A[mid] = target THEN
      RETURN mid
    END IF

    // Determine which half is sorted
    IF A[low] <= A[mid] THEN
      // Left half is sorted
      IF A[low] <= target AND target < A[mid] THEN
        high = mid - 1
      ELSE
        low = mid + 1
      END IF
    ELSE
      // Right half is sorted
      IF A[mid] < target AND target <= A[high] THEN
        low = mid + 1
      ELSE
        high = mid - 1
      END IF
    END IF
  END WHILE

  RETURN -1
END

Real-World Applications

  • Database indexing: B-tree lookups use binary search within each node to find the correct child pointer or key.
  • Git bisect: finds the commit that introduced a bug by binary searching through commit history, treating "bug present" as a monotonic predicate.
  • Autocomplete and spell-check: prefix lookups in sorted dictionaries use lower/upper bound to find all words starting with a given prefix.
  • System libraries: every standard library provides binary search (C's bsearch, Python's bisect, Java's Arrays.binarySearch).
  • Competitive programming: "binary search the answer" is a common technique where you binary search over possible answer values and check feasibility with a monotonic predicate.

Key Takeaways

  • Binary search achieves $O(\log n)$ by halving the search space on every iteration, requiring sorted input
  • The midpoint formula low + (high - low) / 2 prevents integer overflow, a subtle but important detail
  • Lower bound and upper bound are more versatile than exact-match search: they handle duplicates, range queries, and insertion points
  • Rotated array search extends binary search by identifying which half is sorted at each step
  • The general pattern applies beyond arrays: any monotonic predicate over an ordered domain can be binary searched