Heaps and Priority Queues: Always Know the Extreme
How binary heaps maintain the min or max element at the top in O(log n) time, powering priority queues, heap sort, and scheduling systems.
Terminology
- Heap
- A complete binary tree that satisfies the heap property: every parent node is ordered relative to its children (smaller in a min-heap, larger in a max-heap).
- Min-Heap
- A heap where every parent's key is less than or equal to its children's keys. The root holds the minimum element.
- Max-Heap
- A heap where every parent's key is greater than or equal to its children's keys. The root holds the maximum element.
- Complete Binary Tree
- A binary tree where every level is fully filled except possibly the last, which is filled from left to right. This property allows array-based storage with no gaps.
- Priority Queue
- An abstract data type that supports inserting elements with priorities and extracting the element with the highest (or lowest) priority. Binary heaps are the most common implementation.
- Heapify
- The process of restoring the heap property after an element is added or removed. "Sift up" fixes after insertion; "sift down" fixes after extraction.
- Heap Sort
- A comparison-based sorting algorithm that builds a max-heap from the input and repeatedly extracts the maximum to produce sorted output. Runs in $O(n \log n)$ time with $O(1)$ extra space.
What & Why
A heap is a tree-based data structure that keeps the smallest (or largest) element always accessible at the root. It supports insertion and extraction of the extreme element in $O(\log n)$ time, making it the go-to implementation for priority queues.
Why does this matter? Any time you need to repeatedly find the "most important" element from a changing collection, you need a priority queue. Operating system schedulers pick the highest-priority process. Dijkstra's shortest-path algorithm extracts the closest unvisited node. Event-driven simulations process the next event in time order. All of these rely on heaps.
Key properties:
- $O(1)$ access to the min or max: the root always holds the extreme element
- $O(\log n)$ insert and extract: maintained by sifting elements up or down the tree
- Array-backed: the complete binary tree maps perfectly to an array with no pointer overhead
- $O(n)$ build time: a heap can be constructed from an unsorted array in linear time using bottom-up heapify
How It Works
Array Representation
Because a heap is a complete binary tree, it maps directly to an array with no wasted space. For a node at index $i$ (0-based):
- Left child: $2i + 1$
- Right child: $2i + 2$
- Parent: $\lfloor (i - 1) / 2 \rfloor$
For example, the array `[1, 3, 2, 6, 5, 4]` is a valid min-heap: every parent's key is less than or equal to its children's keys, satisfying the min-heap property, and the minimum (1) is at index 0.
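The index arithmetic can be checked directly. The following is a minimal sketch; the array `[1, 3, 2, 6, 5, 4]` is just an illustrative min-heap, and the helper names are not from any standard library:

```python
# Index arithmetic for a 0-based, array-backed binary heap.

def left(i):
    return 2 * i + 1

def right(i):
    return 2 * i + 2

def parent(i):
    return (i - 1) // 2

heap = [1, 3, 2, 6, 5, 4]  # an illustrative min-heap

# The min-heap property: every non-root element is >= its parent.
assert all(heap[parent(i)] <= heap[i] for i in range(1, len(heap)))

print(left(0), right(0), parent(5))  # 1 2 2
```

Note that no pointers are stored anywhere: the tree shape is implied entirely by the indices.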
Insert (Sift Up)
1. Append the new element at the end of the array (next available leaf position)
2. Compare it with its parent; if it violates the heap property, swap them
3. Repeat step 2, moving up the tree, until the property is restored or the root is reached
This takes at most $O(\log n)$ swaps, one per level of the tree.
Extract Min/Max (Sift Down)
1. Save the root element (the min or max)
2. Move the last element in the array to the root position
3. Compare the new root with its children; swap with the smaller child (min-heap) or larger child (max-heap)
4. Repeat step 3, moving down the tree, until the property is restored or a leaf is reached
This also takes at most $O(\log n)$ swaps.
Build Heap (Bottom-Up Heapify)
To convert an unsorted array into a heap, call sift-down on every non-leaf node, starting from the last non-leaf and working toward the root. Despite calling sift-down on roughly $n/2$ nodes, the total work is $O(n)$ because most nodes sit near the bottom of the tree and sift down only a few levels.
The formal analysis sums the work at each level: at height $h$ there are at most $\lceil n / 2^{h+1} \rceil$ nodes, and each sifts down at most $h$ levels, so the total is

$$\sum_{h=0}^{\lfloor \log_2 n \rfloor} \left\lceil \frac{n}{2^{h+1}} \right\rceil \cdot O(h) = O\!\left(n \sum_{h=0}^{\infty} \frac{h}{2^h}\right) = O(n),$$

since the series $\sum_{h \ge 0} h/2^h$ converges (to 2).
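Python's standard-library `heapq.heapify` implements exactly this bottom-up build, operating in place on a plain list in $O(n)$. A quick check that the result satisfies the heap property:

```python
import heapq
import random

data = [random.randrange(1000) for _ in range(15)]
heapq.heapify(data)  # bottom-up heapify: O(n), in place

# Verify the min-heap property on the array representation:
# every non-root element is >= its parent at index (i - 1) // 2.
assert all(data[(i - 1) // 2] <= data[i] for i in range(1, len(data)))

print(data[0] == min(data))  # True: the root holds the minimum
```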
Complexity Analysis
| Operation | Time | Notes |
|---|---|---|
| Find min/max | $O(1)$ | Always at the root (index 0) |
| Insert | $O(\log n)$ | Sift up from leaf to root |
| Extract min/max | $O(\log n)$ | Sift down from root to leaf |
| Build heap | $O(n)$ | Bottom-up heapify |
| Heap sort | $O(n \log n)$ | Build heap + $n$ extractions |
| Space | $O(n)$ | Array storage, no pointers needed |
Heap Sort Analysis
Heap sort works in two phases:
1. Build a max-heap from the input array: $O(n)$
2. Repeatedly extract the max and place it at the end of the array: $n$ extractions at $O(\log n)$ each
Heap sort is in-place ($O(1)$ extra space) and always $O(n \log n)$, but it is not stable and has poor cache locality compared to quicksort.
Implementation
Min-Heap (Pseudocode)
STRUCTURE MinHeap:
    data ← empty array

FUNCTION peek(heap):
    IF length(heap.data) = 0 THEN RETURN UNDEFINED
    RETURN heap.data[0]

FUNCTION insert(heap, val):
    APPEND val TO heap.data
    siftUp(heap, length(heap.data) - 1)

FUNCTION extractMin(heap):
    IF length(heap.data) = 0 THEN RETURN UNDEFINED
    min ← heap.data[0]
    last ← REMOVE last element FROM heap.data
    IF length(heap.data) > 0 THEN
        heap.data[0] ← last
        siftDown(heap, 0)
    RETURN min

FUNCTION siftUp(heap, i):
    WHILE i > 0:
        parent ← floor((i - 1) / 2)
        IF heap.data[i] >= heap.data[parent] THEN BREAK
        SWAP heap.data[i] AND heap.data[parent]
        i ← parent

FUNCTION siftDown(heap, i):
    n ← length(heap.data)
    LOOP:
        smallest ← i
        left ← 2 * i + 1
        right ← 2 * i + 2
        IF left < n AND heap.data[left] < heap.data[smallest] THEN smallest ← left
        IF right < n AND heap.data[right] < heap.data[smallest] THEN smallest ← right
        IF smallest = i THEN BREAK
        SWAP heap.data[i] AND heap.data[smallest]
        i ← smallest
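In practice, most languages ship a min-heap. Python's `heapq` module exposes the same operations on a plain list, so the pseudocode above corresponds roughly to:

```python
import heapq

# heapq maintains a min-heap invariant on an ordinary list.
h = []
for x in [5, 1, 4, 2, 3]:
    heapq.heappush(h, x)      # insert: sift up, O(log n)

print(h[0])                   # 1 — peek: the root is the minimum

out = [heapq.heappop(h) for _ in range(len(h))]  # extract-min: sift down
print(out)                    # [1, 2, 3, 4, 5]
```

For max-heap behavior, the common `heapq` idiom is to push negated keys and negate again on extraction.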
Heap Sort (Pseudocode)
FUNCTION heapSort(arr):
    n ← length(arr)
    // Build max-heap (bottom-up)
    FOR i ← floor(n / 2) - 1 DOWNTO 0:
        siftDownMax(arr, i, n)
    // Extract max repeatedly
    FOR end ← n - 1 DOWNTO 1:
        SWAP arr[0] AND arr[end]
        siftDownMax(arr, 0, end)

FUNCTION siftDownMax(arr, i, size):
    LOOP:
        largest ← i
        left ← 2 * i + 1
        right ← 2 * i + 2
        IF left < size AND arr[left] > arr[largest] THEN largest ← left
        IF right < size AND arr[right] > arr[largest] THEN largest ← right
        IF largest = i THEN BREAK
        SWAP arr[i] AND arr[largest]
        i ← largest
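The same algorithm translates directly to a runnable sketch in Python (function names chosen to mirror the pseudocode):

```python
def sift_down_max(arr, i, size):
    # Push arr[i] down until both children are <= it (max-heap property).
    while True:
        largest = i
        left, right = 2 * i + 1, 2 * i + 2
        if left < size and arr[left] > arr[largest]:
            largest = left
        if right < size and arr[right] > arr[largest]:
            largest = right
        if largest == i:
            return
        arr[i], arr[largest] = arr[largest], arr[i]
        i = largest

def heap_sort(arr):
    n = len(arr)
    # Phase 1: build a max-heap bottom-up in O(n).
    for i in range(n // 2 - 1, -1, -1):
        sift_down_max(arr, i, n)
    # Phase 2: move the max to the end, shrink the heap, restore the property.
    for end in range(n - 1, 0, -1):
        arr[0], arr[end] = arr[end], arr[0]
        sift_down_max(arr, 0, end)

a = [9, 4, 7, 1, 8, 2]
heap_sort(a)
print(a)  # [1, 2, 4, 7, 8, 9]
```

Note that the sort is in place: both phases only swap elements within `arr`.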
Real-World Applications
- OS process scheduling: priority queues determine which process runs next based on priority levels
- Dijkstra's algorithm: a min-heap extracts the nearest unvisited vertex in $O(\log n)$, enabling efficient shortest-path computation
- Event-driven simulation: events are processed in timestamp order using a min-heap as the event queue
- Median maintenance: two heaps (a max-heap for the lower half and a min-heap for the upper half) track the running median of a data stream
- Merge K sorted lists: a min-heap of size $K$ efficiently merges $K$ sorted sequences in $O(N \log K)$ total time, where $N$ is the total number of elements
- Huffman coding: builds an optimal prefix code by repeatedly extracting the two lowest-frequency nodes from a min-heap
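The median-maintenance application above can be sketched with two `heapq` heaps (the `RunningMedian` class name is illustrative; `heapq` has no built-in max-heap, so the lower half stores negated values):

```python
import heapq

class RunningMedian:
    """Two-heap running median of a stream of numbers."""

    def __init__(self):
        self.lo = []  # max-heap of the lower half (values stored negated)
        self.hi = []  # min-heap of the upper half

    def add(self, x):
        # Route x through the lower half so the largest low value moves up,
        # then rebalance so len(lo) is always len(hi) or len(hi) + 1.
        heapq.heappush(self.lo, -x)
        heapq.heappush(self.hi, -heapq.heappop(self.lo))
        if len(self.hi) > len(self.lo):
            heapq.heappush(self.lo, -heapq.heappop(self.hi))

    def median(self):
        if len(self.lo) > len(self.hi):
            return -self.lo[0]                      # odd count: middle element
        return (-self.lo[0] + self.hi[0]) / 2       # even count: average of middles

rm = RunningMedian()
for x in [5, 2, 8, 1]:
    rm.add(x)
print(rm.median())  # 3.5
```

Each insertion costs $O(\log n)$ and each median query is $O(1)$, since both answers sit at the heap roots.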
Key Takeaways
- A binary heap is a complete binary tree stored in an array, with parent-child relationships computed by index arithmetic
- Min-heaps keep the smallest element at the root; max-heaps keep the largest
- Insert and extract operations are $O(\log n)$ via sift-up and sift-down
- Building a heap from an unsorted array is $O(n)$ using bottom-up heapify, not $O(n \log n)$
- Heap sort achieves $O(n \log n)$ worst-case with $O(1)$ extra space, but quicksort is usually faster in practice due to better cache behavior