# Arrays: The Foundation of Everything

*Understanding contiguous memory, indexing, and dynamic arrays: the building block behind nearly every data structure.*
## What & Why
An array is the simplest and most fundamental data structure in computer science: a contiguous block of memory that stores elements of the same type at fixed-size intervals. Every element lives right next to its neighbor in memory, which means the CPU can reach any element in constant time using simple address arithmetic.
Why does this matter? Arrays are the backbone of nearly every other data structure. Stacks, queues, heaps, hash tables, and even strings are typically implemented on top of arrays. Understanding how arrays work at the memory level gives you intuition for performance trade-offs across all of CS.
Key properties:
- Fixed-size allocation: a static array has a set capacity at creation time
- O(1) random access: any element is reachable by index in constant time
- Cache-friendly: contiguous memory means sequential reads hit the CPU cache
- Costly insertions/deletions: shifting elements takes O(n) time in the worst case
## How It Works

### Memory Layout
When you allocate an array of $n$ elements of type $T$, the system reserves a contiguous block of $n \times \text{sizeof}(T)$ bytes. Each element's address is computed directly from the base address:

$$\text{address}(i) = \text{base} + i \times \text{sizeof}(T)$$
This is why indexing is $O(1)$: it's just one multiplication and one addition, regardless of the array size.
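To make the address arithmetic concrete, here is a minimal C sketch (illustrative only) that prints each element's address; consecutive elements differ by exactly `sizeof(int)` bytes:

```c
#include <stdio.h>

int main(void) {
    int arr[5] = {10, 20, 30, 40, 50};

    for (int i = 0; i < 5; i++) {
        /* &arr[i] is computed as base + i * sizeof(int) */
        printf("arr[%d] = %d at address %p\n", i, arr[i], (void *)&arr[i]);
    }
    return 0;
}
```

On a typical platform the printed addresses increase in steps of 4 bytes (`sizeof(int)`), matching the formula above.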
### Dynamic Arrays
Static arrays have a fixed capacity, but most real-world programs need arrays that grow. A dynamic array (like JavaScript's Array, Python's list, or Java's ArrayList) solves this by maintaining:
- An internal fixed-size buffer (the backing array)
- A `size` counter tracking how many elements are actually stored
- A `capacity` representing the total allocated slots
When size reaches capacity, the dynamic array doubles its capacity: it allocates a new buffer of $2 \times \text{capacity}$, copies all elements over, and frees the old buffer.
The doubling strategy is what gives `push` its amortized $O(1)$ cost. Most pushes are $O(1)$ (just write to the next slot), and the occasional $O(n)$ resize is spread across all previous cheap pushes.
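Here is a minimal sketch of that mechanism in C. The names (`dyn_array`, `da_push`) and the `int`-only element type are illustrative assumptions, not any particular library's API, and error handling is omitted for brevity:

```c
#include <stdlib.h>
#include <string.h>

typedef struct {
    int   *data;     /* backing buffer */
    size_t size;     /* elements actually stored */
    size_t capacity; /* total allocated slots */
} dyn_array;

/* Append one element, doubling the buffer when it is full. */
void da_push(dyn_array *da, int value) {
    if (da->size == da->capacity) {
        size_t new_cap = da->capacity ? da->capacity * 2 : 1;
        int *new_data = malloc(new_cap * sizeof(int));   /* allocate new buffer */
        if (da->size > 0)
            memcpy(new_data, da->data, da->size * sizeof(int)); /* the O(n) copy */
        free(da->data);                                  /* free the old buffer */
        da->data = new_data;
        da->capacity = new_cap;
    }
    da->data[da->size++] = value; /* the common O(1) case: write and bump size */
}
```

Zero-initialize (`dyn_array a = {0};`) and call `da_push(&a, 42);` repeatedly; only the occasional resize pays the $O(n)$ copy.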
## Complexity Analysis
| Operation | Time | Space | Notes |
|---|---|---|---|
| Access by index | $O(1)$ | $O(1)$ | Direct address calculation |
| Search (unsorted) | $O(n)$ | $O(1)$ | Must scan every element |
| Search (sorted) | $O(\log n)$ | $O(1)$ | Binary search |
| Insert at end | $O(1)$ amortized | $O(n)$ | Worst case: a resize allocates a new buffer |
| Insert at position $i$ | $O(n)$ | $O(1)$ | Shift elements right |
| Delete at position $i$ | $O(n)$ | $O(1)$ | Shift elements left |
| Delete at end | $O(1)$ | $O(1)$ | Just decrement size |
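For the sorted-search row, the $O(\log n)$ bound comes from binary search, which halves the candidate range on every comparison. A standard sketch in C, assuming an ascending `int` array:

```c
/* Return the index of target in sorted arr[0..n-1], or -1 if absent. */
int binary_search(const int *arr, int n, int target) {
    int lo = 0, hi = n - 1;
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2; /* avoids overflow of lo + hi */
        if (arr[mid] == target)
            return mid;
        if (arr[mid] < target)
            lo = mid + 1;  /* target is in the right half */
        else
            hi = mid - 1;  /* target is in the left half */
    }
    return -1;
}
```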
### Amortized Analysis of Dynamic Array Push
Using the aggregate method: after $n$ pushes starting from capacity 1, the total number of element copies due to resizing is:

$$1 + 2 + 4 + \cdots + 2^{\lfloor \log_2 n \rfloor} = 2^{\lfloor \log_2 n \rfloor + 1} - 1 < 2n$$

So the total work for $n$ pushes is at most $n + 2n = 3n$, giving an amortized cost of:

$$\frac{3n}{n} = 3 = O(1)$$
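A quick empirical check of the $2n$ bound: this standalone C sketch simulates $n$ pushes and counts the copies a doubling resize would perform, without building the array itself:

```c
#include <stdio.h>

int main(void) {
    long n = 1000000;
    long capacity = 1, copies = 0;

    for (long size = 0; size < n; size++) {
        if (size == capacity) {  /* buffer full: double it, copying `size` elements */
            copies += size;
            capacity *= 2;
        }
    }
    printf("pushes: %ld, resize copies: %ld, bound 2n: %ld\n", n, copies, 2 * n);
    return 0;
}
```

For $n = 1{,}000{,}000$ this reports well under $2n$ copies, consistent with the aggregate analysis above.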
### Space Overhead
A dynamic array with $n$ elements and the doubling strategy uses at most $2n$ slots of memory. After the first resize, the space utilization is always at least 50%:

$$\frac{\text{size}}{\text{capacity}} \ge \frac{1}{2}$$
## Real-World Applications
- Database row storage: Fixed-size records stored in contiguous pages for fast sequential scans (e.g., PostgreSQL heap pages)
- Image processing: Pixel buffers are 2D arrays; contiguous layout enables SIMD vectorization
- Network buffers: Ring buffers (circular arrays) manage incoming packet data in OS kernels (see the sketch after this list)
- JavaScript engines: V8 uses "packed" arrays (dense, contiguous) for performance, falling back to dictionary mode only when arrays become sparse
- GPU computing: CUDA kernels operate on arrays in global memory; coalesced access patterns depend on contiguous layout
- Sorting algorithms: Merge sort, quick sort, and heap sort all operate on arrays as their primary data structure
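Since ring buffers come up above, here is a simplified single-threaded sketch in C (the names and the power-of-two capacity are illustrative choices; real kernel ring buffers add synchronization):

```c
#include <stdbool.h>
#include <stddef.h>

#define RING_CAP 8  /* power of two, so indices wrap with a cheap bit mask */

typedef struct {
    int    buf[RING_CAP];
    size_t head;  /* total elements ever read */
    size_t tail;  /* total elements ever written */
} ring_buffer;

/* Enqueue one value; returns false if the buffer is full. */
bool ring_put(ring_buffer *rb, int value) {
    if (rb->tail - rb->head == RING_CAP) return false;
    rb->buf[rb->tail++ & (RING_CAP - 1)] = value;
    return true;
}

/* Dequeue one value into *out; returns false if the buffer is empty. */
bool ring_get(ring_buffer *rb, int *out) {
    if (rb->tail == rb->head) return false;
    *out = rb->buf[rb->head++ & (RING_CAP - 1)];
    return true;
}
```

Because the indices only move forward and wrap via the mask, no element shifting is ever needed: the fixed array is reused in place.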
## Key Takeaways
- Arrays provide $O(1)$ random access: no other data structure matches this for indexed lookups
- Contiguous memory layout makes arrays cache-friendly, which matters more than Big-O in practice for sequential workloads
- Dynamic arrays achieve amortized $O(1)$ append by doubling capacity on overflow
- The trade-off: insertion and deletion in the middle cost $O(n)$ due to element shifting
- Arrays are the foundation for stacks, queues, heaps, hash tables, and most other data structures you'll encounter in this curriculum