
Amortized Analysis: The True Cost of Operations

Understanding amortized analysis through the aggregate method, accounting method, and potential method, with dynamic array and splay tree examples showing why worst-case per-operation analysis can be misleading.

2025-10-23
Theory of Computation · amortized-analysis · potential-method

Terminology

| Term | Definition |
| --- | --- |
| Amortized analysis | A technique for analyzing the average cost per operation over a worst-case sequence of operations, giving a tighter bound than worst-case per-operation analysis |
| Aggregate method | Compute the total cost of $n$ operations, then divide by $n$ to get the amortized cost per operation |
| Accounting method | Assign each operation an amortized cost (possibly different from actual cost); cheap operations "save" credit that expensive operations "spend" |
| Potential method | Define a potential function $\Phi$ on the data structure's state; the amortized cost is the actual cost plus the change in potential |
| Potential function ($\Phi$) | A function mapping data structure states to non-negative real numbers, representing "stored energy" available for future expensive operations |
| Dynamic array | An array that doubles its capacity when full; individual inserts are $O(n)$ worst case but $O(1)$ amortized |
| Splay tree | A self-adjusting BST that moves accessed nodes to the root via rotations; $O(\log n)$ amortized per operation despite $O(n)$ worst case |
| Credit (accounting) | Prepaid work stored during cheap operations that pays for future expensive operations; total credit must never go negative |

What & Why

Worst-case analysis can be misleading. A dynamic array's push operation is O(n) in the worst case (when it needs to resize), but most pushes are O(1). If you analyze a sequence of n pushes using worst-case per-operation analysis, you get O(n^2). But the actual total cost is O(n), giving an amortized cost of O(1) per push.

Amortized analysis gives you the true cost of a sequence of operations. It does not use probability or averages over random inputs. It guarantees that for any sequence of n operations, the total cost is bounded, and the per-operation average of that bound is the amortized cost.

This matters whenever you use data structures with occasional expensive operations: dynamic arrays, hash tables with rehashing, splay trees, Fibonacci heaps, or union-find with path compression. The amortized cost tells you the real performance story.

How It Works

Method 1: Aggregate Method

Count the total cost of n operations, then divide by n.

Dynamic array example: Start with capacity 1. When full, double the capacity and copy all elements.

For n pushes, a resize occurs whenever the array is full: before pushes 2, 3, 5, 9, 17, ..., copying 1, 2, 4, 8, ... elements. These copy costs form a geometric series summing to at most 2n. Adding the n unit costs of the pushes themselves: total $\le 3n$. Amortized cost: 3n / n = O(1).
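This is easy to check empirically. A minimal sketch (the function name and the cost model of one unit per element written plus one per element copied are my own) counts the total cost of n pushes:

```python
def total_push_cost(n):
    """Simulate n pushes into a doubling array, counting unit costs:
    one per element written, plus one per element copied during a resize."""
    size, capacity, cost = 0, 1, 0
    for _ in range(n):
        if size == capacity:
            cost += size       # resize: copy every existing element
            capacity *= 2
        cost += 1              # write the new element
        size += 1
    return cost

# The aggregate bound holds for every n: total cost <= 3n.
for n in (1, 10, 1000, 10**6):
    assert total_push_cost(n) <= 3 * n
```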

*Figure: Dynamic array cost per push — most pushes cost 1, punctuated by resize + copy spikes; the amortized cost is 3, i.e. O(1).*

Method 2: Accounting Method

Assign each operation an amortized cost. Cheap operations pay more than their actual cost, banking credit. Expensive operations use that credit.

Dynamic array: Charge each push an amortized cost of 3 units:

  • 1 unit for the push itself
  • 2 units saved as credit (1 for copying this element later, 1 for copying a matching old element)

When a resize happens at capacity k, we need to copy k elements. But we have k/2 elements that each saved 2 credits (they were inserted since the last resize), giving k credits total. The resize is fully paid for.
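A short simulation (my own sketch of the bookkeeping, not part of the original scheme) charges every push 3 units and verifies that the credit bank never goes negative:

```python
def accounting_simulation(n):
    """Charge every push an amortized 3 units; track the credit bank.
    Resizes are paid entirely from previously banked credit."""
    size, capacity, bank = 0, 1, 0
    for _ in range(n):
        bank += 3              # amortized charge for this push
        if size == capacity:
            bank -= size       # spend credit to copy existing elements
            capacity *= 2
        bank -= 1              # pay for the push itself
        size += 1
        assert bank >= 0       # the invariant: credit never goes negative
    return bank
```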

Method 3: Potential Method

Define a potential function $\Phi(D_i)$ on the state of the data structure after operation $i$. The amortized cost of operation $i$ is:

$\hat{c}_i = c_i + \Phi(D_i) - \Phi(D_{i-1})$

The total amortized cost telescopes:

$\sum_{i=1}^{n} \hat{c}_i = \sum_{i=1}^{n} c_i + \Phi(D_n) - \Phi(D_0)$

If $\Phi(D_n) \ge \Phi(D_0)$ (typically $\Phi(D_0) = 0$ and $\Phi$ never goes negative), then the total amortized cost is an upper bound on the total actual cost.

Dynamic array potential: Let $\Phi = 2 \cdot \text{size} - \text{capacity}$.

  • After a push without resize: $\Phi$ increases by 2 (size goes up by 1, capacity is unchanged). Amortized cost: $1 + 2 = 3$.
  • After a push with resize from capacity $k$ to $2k$: actual cost is $k + 1$ (copy $k$ elements + 1 push). $\Phi$ changes from $2k - k = k$ to $2(k+1) - 2k = 2$, a change of $2 - k$. Amortized cost: $(k+1) + (2 - k) = 3$.

Both cases give an amortized cost of 3, consistent with the other methods.
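To see the telescoping concretely, here is a small sketch (the simulation and its name are mine) that computes $\hat{c}_i = c_i + \Phi(D_i) - \Phi(D_{i-1})$ for every push under $\Phi = 2 \cdot \text{size} - \text{capacity}$:

```python
def amortized_costs(n):
    """Per-push amortized cost under Phi = 2*size - capacity."""
    size, capacity = 0, 1
    costs = []
    for _ in range(n):
        phi_before = 2 * size - capacity
        actual = 1
        if size == capacity:
            actual += size     # copy cost on resize
            capacity *= 2
        size += 1
        phi_after = 2 * size - capacity
        costs.append(actual + phi_after - phi_before)
    return costs

# Every push, resize or not, has amortized cost exactly 3.
assert set(amortized_costs(1000)) == {3}
```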

*Diagram: three methods, same answer — aggregate: total / n = 3n / n = 3; accounting: charge 3 per push, bank 2 as credit; potential: $\Phi$ = 2·size − capacity, cost + $\Delta\Phi$ = 3. All give O(1) amortized.*

Complexity Analysis

| Data Structure | Operation | Worst Case | Amortized |
| --- | --- | --- | --- |
| Dynamic array | Push | $O(n)$ | $O(1)$ |
| Hash table (with rehash) | Insert | $O(n)$ | $O(1)$ |
| Splay tree | Search/Insert/Delete | $O(n)$ | $O(\log n)$ |
| Fibonacci heap | Decrease-key | $O(n)$ | $O(1)$ |
| Union-Find (path compression) | Find/Union | $O(\log n)$ | $O(\alpha(n)) \approx O(1)$ |

Implementation

ALGORITHM DynamicArrayPush(array, element)
INPUT: array: dynamic array with size and capacity, element: value to add
OUTPUT: updated array
BEGIN
  IF array.size = array.capacity THEN
    newCapacity <- array.capacity * 2
    newData <- ALLOCATE array of size newCapacity
    FOR i <- 0 TO array.size - 1 DO
      newData[i] <- array.data[i]
    END FOR
    array.data <- newData
    array.capacity <- newCapacity
  END IF

  array.data[array.size] <- element
  array.size <- array.size + 1
END

ALGORITHM AmortizedCostAggregate(operations, n)
INPUT: operations: sequence of n operations, n: count
OUTPUT: amortized cost per operation
BEGIN
  totalCost <- 0
  FOR i <- 1 TO n DO
    totalCost <- totalCost + ActualCost(operations[i])
  END FOR
  RETURN totalCost / n
END

ALGORITHM PotentialMethodCost(actualCost, phiBefore, phiAfter)
INPUT: actualCost: real cost of operation, phiBefore: potential before, phiAfter: potential after
OUTPUT: amortized cost of this operation
BEGIN
  RETURN actualCost + (phiAfter - phiBefore)
END

ALGORITHM MultiPopStack_Amortized(stack, k)
INPUT: stack: stack with n elements, k: number to pop
OUTPUT: popped elements
BEGIN
  popped <- empty list
  count <- MIN(k, stack.size)
  FOR i <- 1 TO count DO
    popped <- popped + [POP(stack)]
  END FOR
  RETURN popped
END
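MultiPopStack is the classic second example: a single MultiPop can cost O(n), but over any sequence each element is pushed once and popped at most once, so n mixed operations cost at most 2n in total. A quick check (the function name and cost model here are my own):

```python
def multipop_total_cost(ops):
    """Total unit cost of a push/multipop sequence: each push costs 1,
    and MultiPop(k) costs min(k, current stack size)."""
    stack, cost = [], 0
    for op, arg in ops:
        if op == "push":
            stack.append(arg)
            cost += 1
        else:  # ("multipop", k)
            count = min(arg, len(stack))
            for _ in range(count):
                stack.pop()
            cost += count
    return cost

# 100 pushes then one big MultiPop: each element is popped at most once,
# so the total cost of n operations never exceeds 2n.
ops = [("push", i) for i in range(100)] + [("multipop", 250)]
assert multipop_total_cost(ops) <= 2 * len(ops)
```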

Real-World Applications

  • Dynamic arrays everywhere: JavaScript arrays, Python lists, Java ArrayLists, and C++ vectors all use doubling strategies with $O(1)$ amortized push, justified by amortized analysis
  • Hash table rehashing: when a hash table's load factor exceeds a threshold, it rehashes all entries into a larger table; amortized analysis proves this keeps insert at $O(1)$ average
  • Splay trees in caching: splay trees provide $O(\log n)$ amortized access and naturally move frequently accessed items to the root, making them effective for cache implementations
  • Garbage collection: generational garbage collectors amortize the cost of collection over many allocations; most collections are cheap (young generation), with rare expensive full collections
  • Network buffer management: network stacks use amortized analysis to justify buffer allocation strategies where occasional large allocations are spread across many small packet arrivals
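The rehashing case follows the same aggregate argument as the dynamic array. A toy cost model (not a real hash table — the load factor, initial capacity, and function name are all assumptions for illustration) shows the total stays linear:

```python
def hash_insert_cost(n, load_factor=0.75):
    """Count unit costs for n inserts into a table that rehashes
    (moves every existing entry) once occupancy would exceed load_factor."""
    size, capacity, cost = 0, 8, 0
    for _ in range(n):
        if (size + 1) / capacity > load_factor:
            cost += size       # rehash: move every existing entry
            capacity *= 2
        cost += 1              # insert the new entry
        size += 1
    return cost

# Geometric capacity growth keeps the total linear in n: O(1) amortized.
assert hash_insert_cost(10**5) <= 4 * 10**5
```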

Key Takeaways

  • Amortized analysis gives the true average cost per operation over a worst-case sequence, not a probabilistic average
  • All three methods (aggregate, accounting, potential) can prove the same bound; they differ in bookkeeping, not power, so choose whichever is most natural for the problem
  • The potential method is the most general: define $\Phi$ so that cheap operations increase potential and expensive operations decrease it
  • Dynamic array push is the canonical example: $O(n)$ worst case but $O(1)$ amortized, because resizes are exponentially rare
  • Amortized analysis is essential for understanding the real performance of hash tables, splay trees, Fibonacci heaps, and union-find