Heaps and Priority Queues: Always Know the Extreme
How binary heaps keep the min or max element at the root, supporting $O(\log n)$ insertion and extraction, powering priority queues, heap sort, and scheduling systems.
Terminology
- Heap
- A complete binary tree that satisfies the heap property: every parent node is ordered relative to its children (smaller in a min-heap, larger in a max-heap).
- Min-Heap
- A heap where every parent's key is less than or equal to its children's keys. The root holds the minimum element.
- Max-Heap
- A heap where every parent's key is greater than or equal to its children's keys. The root holds the maximum element.
- Complete Binary Tree
- A binary tree where every level is fully filled except possibly the last, which is filled from left to right. This property allows array-based storage with no gaps.
- Priority Queue
- An abstract data type that supports inserting elements with priorities and extracting the element with the highest (or lowest) priority. Binary heaps are the most common implementation.
- Heapify
- The process of restoring the heap property after an element is added or removed. "Sift up" fixes after insertion; "sift down" fixes after extraction.
- Heap Sort
- A comparison-based sorting algorithm that builds a max-heap from the input and repeatedly extracts the maximum to produce sorted output. Runs in $O(n \log n)$ time with $O(1)$ extra space.
What & Why
A heap is a tree-based data structure that keeps the smallest (or largest) element always accessible at the root. It supports insertion and extraction of the extreme element in $O(\log n)$ time, making it the go-to implementation for priority queues.
Why does this matter? Any time you need to repeatedly find the "most important" element from a changing collection, you need a priority queue. Operating system schedulers pick the highest-priority process. Dijkstra's shortest-path algorithm extracts the closest unvisited node. Event-driven simulations process the next event in time order. All of these rely on heaps.
Key properties:
- $O(1)$ access to the min or max: the root always holds the extreme element
- $O(\log n)$ insert and extract: maintained by sifting elements up or down the tree
- Array-backed: the complete binary tree maps perfectly to an array with no pointer overhead
- $O(n)$ build time: a heap can be constructed from an unsorted array in linear time using bottom-up heapify
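These properties are exactly what Python's standard-library `heapq` module provides (a min-heap stored in a plain list); a minimal priority-queue sketch:

```python
import heapq

# heapq treats a plain Python list as a min-heap.
pq = []
heapq.heappush(pq, (2, "write report"))   # (priority, task) pairs
heapq.heappush(pq, (1, "fix outage"))
heapq.heappush(pq, (3, "refill coffee"))

# O(1) access to the extreme element: it is always at index 0.
print(pq[0])              # (1, "fix outage")

# heappop extracts the minimum in O(log n).
print(heapq.heappop(pq))  # (1, "fix outage")
print(heapq.heappop(pq))  # (2, "write report")
```

The `(priority, task)` tuples are compared lexicographically, so the smallest priority number wins; the task names here are just illustrative.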
How It Works
Array Representation
Because a heap is a complete binary tree, it maps directly to an array with no wasted space. For a node at index $i$ (0-based):
- Left child: $2i + 1$
- Right child: $2i + 2$
- Parent: $\lfloor (i - 1) / 2 \rfloor$
For example, the array [1, 4, 2, 7, 5] is a valid min-heap: every parent is smaller than both children, and the minimum (1) is always at index 0.
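The index formulas can be checked directly; a small sketch (0-based indexing, example values chosen arbitrarily):

```python
def left(i):   return 2 * i + 1
def right(i):  return 2 * i + 2
def parent(i): return (i - 1) // 2   # floor division

# An array that satisfies the min-heap property.
heap = [1, 4, 2, 7, 5]

# Verify the property using only index arithmetic.
n = len(heap)
for i in range(n):
    if left(i) < n:
        assert heap[i] <= heap[left(i)]
    if right(i) < n:
        assert heap[i] <= heap[right(i)]

print(parent(4))  # 1 -- index 4 (value 5) is a child of index 1 (value 4)
```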
Insert (Sift Up)
1. Append the new element at the end of the array (next available leaf position)
2. Compare it with its parent; if it violates the heap property, swap them
3. Repeat step 2, moving up the tree, until the property is restored or the root is reached
This takes at most $O(\log n)$ swaps, one per level of the tree.
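Sift-up can be watched in action with `heapq.heappush`, which performs exactly these steps on a list-based min-heap:

```python
import heapq

heap = [1, 4, 2, 7, 5]   # already a valid min-heap
heapq.heappush(heap, 0)  # append 0 at index 5, then sift it up

# 0 swaps with its parent 2 (index 2), then with the root 1 (index 0):
print(heap)  # [0, 4, 1, 7, 5, 2]
```

Two swaps for six elements: one per level climbed, as the $O(\log n)$ bound promises.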
Extract Min/Max (Sift Down)
1. Save the root element (the min or max)
2. Move the last element in the array to the root position
3. Compare the new root with its children; swap with the smaller child (min-heap) or larger child (max-heap)
4. Repeat step 3, moving down the tree, until the property is restored or a leaf is reached
This also takes at most $O(\log n)$ swaps.
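The same steps are visible in `heapq.heappop`, which saves the root, moves the last element up, and sifts it down:

```python
import heapq

heap = [0, 4, 1, 7, 5, 2]       # valid min-heap
smallest = heapq.heappop(heap)  # save 0, move 2 to the root, sift down

print(smallest)  # 0
# 2 swapped with its smaller child 1, then stopped at a valid position:
print(heap)      # [1, 4, 2, 7, 5]
```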
Build Heap (Bottom-Up Heapify)
To convert an unsorted array into a heap, call sift-down on every non-leaf node, starting from the last non-leaf (index $\lfloor n/2 \rfloor - 1$) and working toward the root. Despite calling sift-down roughly $n/2$ times, the total work is $O(n)$ because most nodes are near the bottom and sift down only a few levels.
The formal analysis sums the work at each level: a node at height $h$ sifts down at most $h$ levels, and there are at most $\lceil n / 2^{h+1} \rceil$ nodes at height $h$, so the total is $\sum_{h=0}^{\lfloor \log n \rfloor} \frac{n}{2^{h+1}} \cdot O(h) = O\!\left(n \sum_{h=0}^{\infty} \frac{h}{2^{h}}\right) = O(n)$, since the series converges to a constant.
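Bottom-up construction is what `heapq.heapify` does on a Python list; a quick sketch verifying the result:

```python
import heapq

data = [9, 4, 7, 1, 0, 8, 5, 2]
heapq.heapify(data)  # bottom-up heapify, O(n)

# The heap property now holds at every parent index.
n = len(data)
assert all(data[i] <= data[c]
           for i in range(n)
           for c in (2 * i + 1, 2 * i + 2) if c < n)
print(data[0])  # 0 -- the minimum sits at the root
```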
Complexity Analysis
| Operation | Time | Notes |
|---|---|---|
| Find min/max | $O(1)$ | Always at the root (index 0) |
| Insert | $O(\log n)$ | Sift up from leaf to root |
| Extract min/max | $O(\log n)$ | Sift down from root to leaf |
| Build heap | $O(n)$ | Bottom-up heapify |
| Heap sort | $O(n \log n)$ | Build heap + $n$ extractions |
| Space | $O(n)$ | Array storage, no pointers needed |
Heap Sort Analysis
Heap sort works in two phases:
- Build a max-heap from the input array: $O(n)$
- Repeatedly extract the max and place it at the end of the array: $n$ extractions at $O(\log n)$ each
Heap sort is in-place ($O(1)$ extra space) and always $O(n \log n)$, but it is not stable and has poor cache locality compared to quicksort.
Implementation
Min-Heap (Pseudocode)
STRUCTURE MinHeap:
    data ← empty array

FUNCTION peek(heap):
    IF length(heap.data) = 0 THEN RETURN UNDEFINED
    RETURN heap.data[0]

FUNCTION insert(heap, val):
    APPEND val TO heap.data
    siftUp(heap, length(heap.data) - 1)

FUNCTION extractMin(heap):
    IF length(heap.data) = 0 THEN RETURN UNDEFINED
    min ← heap.data[0]
    last ← REMOVE last element FROM heap.data
    IF length(heap.data) > 0 THEN
        heap.data[0] ← last
        siftDown(heap, 0)
    RETURN min

FUNCTION siftUp(heap, i):
    WHILE i > 0:
        parent ← floor((i - 1) / 2)
        IF heap.data[i] >= heap.data[parent] THEN BREAK
        SWAP heap.data[i] AND heap.data[parent]
        i ← parent

FUNCTION siftDown(heap, i):
    n ← length(heap.data)
    LOOP:
        smallest ← i
        left ← 2 * i + 1
        right ← 2 * i + 2
        IF left < n AND heap.data[left] < heap.data[smallest] THEN smallest ← left
        IF right < n AND heap.data[right] < heap.data[smallest] THEN smallest ← right
        IF smallest = i THEN BREAK
        SWAP heap.data[i] AND heap.data[smallest]
        i ← smallest
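A direct translation of the pseudocode into runnable Python (class and method names are my own):

```python
class MinHeap:
    def __init__(self):
        self.data = []

    def peek(self):
        """Return the minimum without removing it, or None if empty."""
        return self.data[0] if self.data else None

    def insert(self, val):
        """Append at the next leaf position and sift up."""
        self.data.append(val)
        self._sift_up(len(self.data) - 1)

    def extract_min(self):
        """Remove and return the minimum, or None if empty."""
        if not self.data:
            return None
        minimum = self.data[0]
        last = self.data.pop()
        if self.data:
            self.data[0] = last
            self._sift_down(0)
        return minimum

    def _sift_up(self, i):
        while i > 0:
            parent = (i - 1) // 2
            if self.data[i] >= self.data[parent]:
                break
            self.data[i], self.data[parent] = self.data[parent], self.data[i]
            i = parent

    def _sift_down(self, i):
        n = len(self.data)
        while True:
            smallest = i
            left, right = 2 * i + 1, 2 * i + 2
            if left < n and self.data[left] < self.data[smallest]:
                smallest = left
            if right < n and self.data[right] < self.data[smallest]:
                smallest = right
            if smallest == i:
                break
            self.data[i], self.data[smallest] = self.data[smallest], self.data[i]
            i = smallest
```

Successive `extract_min` calls return the stored values in ascending order, which is also the easiest way to sanity-check the implementation.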
Heap Sort (Pseudocode)
FUNCTION heapSort(arr):
    n ← length(arr)
    // Build max-heap (bottom-up)
    FOR i ← floor(n / 2) - 1 DOWNTO 0:
        siftDownMax(arr, i, n)
    // Extract max repeatedly
    FOR end ← n - 1 DOWNTO 1:
        SWAP arr[0] AND arr[end]
        siftDownMax(arr, 0, end)

FUNCTION siftDownMax(arr, i, size):
    LOOP:
        largest ← i
        left ← 2 * i + 1
        right ← 2 * i + 2
        IF left < size AND arr[left] > arr[largest] THEN largest ← left
        IF right < size AND arr[right] > arr[largest] THEN largest ← right
        IF largest = i THEN BREAK
        SWAP arr[i] AND arr[largest]
        i ← largest
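The same two-phase algorithm in runnable Python (function names are my own; the original mutates in place, and the `return` is added only for convenience):

```python
def heap_sort(arr):
    """In-place heap sort: build a max-heap, then extract repeatedly."""
    n = len(arr)
    # Phase 1: build max-heap bottom-up, starting at the last non-leaf.
    for i in range(n // 2 - 1, -1, -1):
        _sift_down_max(arr, i, n)
    # Phase 2: move the max to the end and shrink the heap boundary.
    for end in range(n - 1, 0, -1):
        arr[0], arr[end] = arr[end], arr[0]
        _sift_down_max(arr, 0, end)
    return arr

def _sift_down_max(arr, i, size):
    while True:
        largest = i
        left, right = 2 * i + 1, 2 * i + 2
        if left < size and arr[left] > arr[largest]:
            largest = left
        if right < size and arr[right] > arr[largest]:
            largest = right
        if largest == i:
            break
        arr[i], arr[largest] = arr[largest], arr[i]
        i = largest

print(heap_sort([5, 2, 9, 1, 7, 3]))  # [1, 2, 3, 5, 7, 9]
```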
Real-World Applications
- OS process scheduling: priority queues determine which process runs next based on priority levels
- Dijkstra's algorithm: a min-heap extracts the nearest unvisited vertex in $O(\log n)$, enabling efficient shortest-path computation
- Event-driven simulation: events are processed in timestamp order using a min-heap as the event queue
- Median maintenance: two heaps (a max-heap for the lower half and a min-heap for the upper half) track the running median of a data stream
- Merge K sorted lists: a min-heap of size $K$ efficiently merges $K$ sorted sequences in $O(N \log K)$ total time, where $N$ is the total number of elements
- Huffman coding: builds an optimal prefix code by repeatedly extracting the two lowest-frequency nodes from a min-heap
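The median-maintenance trick above can be sketched with `heapq` (which only provides a min-heap, so the lower half stores negated values to simulate a max-heap; the class and method names are illustrative):

```python
import heapq

class RunningMedian:
    def __init__(self):
        self.lower = []  # max-heap of the lower half (values negated)
        self.upper = []  # min-heap of the upper half

    def add(self, x):
        # Push onto the lower half, then move its max to the upper half.
        heapq.heappush(self.lower, -x)
        heapq.heappush(self.upper, -heapq.heappop(self.lower))
        # Rebalance: the lower half keeps equal size or one extra element.
        if len(self.upper) > len(self.lower):
            heapq.heappush(self.lower, -heapq.heappop(self.upper))

    def median(self):
        if len(self.lower) > len(self.upper):
            return -self.lower[0]          # odd count: lower half's max
        return (-self.lower[0] + self.upper[0]) / 2  # even count: average
```

Each `add` costs $O(\log n)$ and `median` is $O(1)$, since both answers come from heap roots.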
Key Takeaways
- A binary heap is a complete binary tree stored in an array, with parent-child relationships computed by index arithmetic
- Min-heaps keep the smallest element at the root; max-heaps keep the largest
- Insert and extract operations are $O(\log n)$ via sift-up and sift-down
- Building a heap from an unsorted array is $O(n)$ using bottom-up heapify, not $O(n \log n)$
- Heap sort achieves $O(n \log n)$ worst-case with $O(1)$ extra space, but quicksort is usually faster in practice due to better cache behavior