
Linked Lists: Pointers All the Way Down

Singly, doubly, and circular linked lists: how pointer-based data structures trade random access for efficient insertion and deletion.

2021-01-03
CS Fundamentals · linked-lists · data-structures

What & Why

A linked list is a linear data structure where each element (called a node) stores its value and a pointer (or reference) to the next node. Unlike arrays, linked list nodes are scattered across memory. They don't need contiguous allocation.

Why does this matter? Linked lists solve the biggest weakness of arrays: insertion and deletion in the middle. With an array, inserting at position i requires shifting every element after it, costing O(n) work. With a linked list, once you have a reference to the right node, insertion is O(1): just rewire two pointers.

The trade-off is real though. You lose O(1) random access. To reach the k-th element, you must walk k pointers from the head. This makes linked lists a poor choice when you need frequent indexed lookups, but an excellent choice when your workload is dominated by insertions, deletions, and sequential traversal.

Key properties:

  • Dynamic size: no pre-allocation or resizing needed; nodes are allocated individually
  • O(1) insertion/deletion: at a known position, just rewire pointers
  • O(n) access: reaching element k requires traversing k nodes
  • No wasted space: memory usage is proportional to the number of elements (plus pointer overhead)
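To make these properties concrete, here is a minimal Python sketch (class and function names are illustrative): nodes are allocated one at a time, linked by references, and reaching element k means walking k pointers.

```python
class Node:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

# Build 10 -> 20 -> 30 by prepending: no resizing, no copying.
head = None
for v in (30, 20, 10):
    head = Node(v, head)

# O(n) access: reaching element k requires k pointer hops from the head.
def get(head, k):
    node = head
    for _ in range(k):
        node = node.next
    return node.value

print(get(head, 2))  # 30
```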

How It Works

Singly Linked List

Each node holds a value and a next pointer. The list is accessed through a head pointer. The last node's next is null.

Singly Linked List: head → 10 → 20 → 30 → 40 → null

Doubly Linked List

Each node has both a next and a prev pointer, enabling traversal in both directions. This costs an extra pointer per node but lets you delete a node in $O(1)$ when you already have a reference to it. No need to find the previous node first.

Doubly Linked List: head → A ⇄ B ⇄ C ← tail (A.prev = null, C.next = null)

Circular Linked List

In a circular variant, the last node's next points back to the head (and in a doubly-circular list, the head's prev points to the tail). There is no null terminator, so traversal wraps around endlessly. This is useful for round-robin scheduling, circular buffers, and any scenario where you need to cycle through elements continuously.

Circular Singly Linked List: head → A → B → C → D → (wraps back to A)
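A short Python sketch of round-robin cycling over a circular list (names are illustrative; a lone node points to itself, so the invariant holds from the start):

```python
class CNode:
    def __init__(self, value):
        self.value = value
        self.next = self  # a single node forms a one-element cycle

def insert_after(node, value):
    new = CNode(value)
    new.next = node.next
    node.next = new
    return new

# Build A -> B -> C -> (back to A)
a = CNode("A")
b = insert_after(a, "B")
c = insert_after(b, "C")

# Round-robin: keep following .next; there is no null to stop you,
# so the caller decides when to quit.
cur, order = a, []
for _ in range(7):
    order.append(cur.value)
    cur = cur.next
print(order)  # ['A', 'B', 'C', 'A', 'B', 'C', 'A']
```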

Insertion at a Known Position

Inserting a new node after a given node prev in a singly linked list takes three steps:

  1. Allocate a new node with the desired value
  2. Set newNode.next = prev.next
  3. Set prev.next = newNode

No elements are shifted. No memory is copied. Just two pointer assignments, $O(1)$.
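The three steps above can be sketched in Python (the Node shape is the hypothetical value/next pair described earlier):

```python
class Node:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

def insert_after(prev, value):
    node = Node(value)     # 1. allocate a new node
    node.next = prev.next  # 2. new node adopts prev's successor
    prev.next = node       # 3. prev points to the new node

head = Node(10, Node(30))
insert_after(head, 20)     # 10 -> 20 -> 30; nothing shifted, nothing copied
```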

Deletion at a Known Position

To delete the node after prev in a singly linked list:

  1. Set prev.next = prev.next.next
  2. The removed node is now unreachable and can be garbage collected

In a doubly linked list, you can delete a node directly (without needing its predecessor) because you have the prev pointer.
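The singly-linked deletion is one pointer assignment (sketch, same hypothetical Node shape as above):

```python
class Node:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

def delete_after(prev):
    # Bypass the successor; it becomes unreachable and, in a
    # garbage-collected language, is reclaimed automatically.
    prev.next = prev.next.next

head = Node(10, Node(20, Node(30)))
delete_after(head)  # removes 20: 10 -> 30
```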

Complexity Analysis

| Operation | Singly | Doubly | Circular | Notes |
| --- | --- | --- | --- | --- |
| Access by index | $O(n)$ | $O(n)$ | $O(n)$ | Must traverse from head (or tail) |
| Search | $O(n)$ | $O(n)$ | $O(n)$ | Linear scan; circular must detect full loop |
| Insert at head | $O(1)$ | $O(1)$ | $O(1)$* | *Circular must also update tail's next |
| Insert at tail | $O(n)$ / $O(1)$* | $O(1)$ | $O(1)$* | *$O(1)$ if a tail pointer is maintained |
| Insert after node | $O(1)$ | $O(1)$ | $O(1)$ | Given a reference to the node |
| Delete known node | $O(n)$ | $O(1)$ | $O(n)$ / $O(1)$* | *$O(1)$ if doubly-circular |
| Delete at head | $O(1)$ | $O(1)$ | $O(1)$* | *Circular must also update tail's next |
| Full traversal | $O(n)$ | $O(n)$ | $O(n)$ | Circular: stop when you return to head |

Memory Overhead

Each node in a singly linked list stores one pointer alongside its data. For a doubly linked list, it's two pointers. On a 64-bit system, each pointer is 8 bytes:

$\text{Overhead}_{\text{singly}} = n \times 8 \text{ bytes}$
$\text{Overhead}_{\text{doubly}} = n \times 16 \text{ bytes}$

For small data types (like integers), the pointer overhead can exceed the data itself. This is why arrays are preferred when random access matters and the dataset fits in contiguous memory.
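Plugging concrete numbers into the formulas above (assuming 64-bit pointers and 8-byte integer payloads):

```python
# One million 8-byte integers.
n = 1_000_000
data = n * 8                   # 8 MB of actual payload
singly_overhead = n * 8        # one next pointer per node: another 8 MB
doubly_overhead = n * 16       # next + prev pointers: another 16 MB

print(singly_overhead / data)  # 1.0 -> pointers match the data, byte for byte
print(doubly_overhead / data)  # 2.0 -> pointers are twice the data
```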

Cache Performance

Linked list nodes are heap-allocated and scattered in memory. This means traversal suffers from poor spatial locality: each node.next dereference is likely a cache miss. Arrays, by contrast, benefit from hardware prefetching because elements are contiguous. In practice, this cache penalty often dominates the theoretical $O(1)$ insertion advantage for small lists.
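As a rough illustration, here is a sketch that traverses both layouts. It is not a rigorous benchmark: in CPython, list elements are boxed objects themselves, which blunts the contiguity advantage, so treat the timing as indicative only.

```python
import time

class Node:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

N = 200_000

# Contiguous storage: elements sit next to each other in the list's
# backing array, which is friendly to hardware prefetching.
arr = list(range(N))

# Scattered storage: each node is a separate heap allocation,
# and each .next hop may land anywhere.
head = None
for v in reversed(range(N)):
    head = Node(v, head)

t0 = time.perf_counter()
s1 = sum(arr)
t1 = time.perf_counter()

s2, node = 0, head
while node is not None:
    s2 += node.value
    node = node.next
t2 = time.perf_counter()

# Both traversals compute the same sum; the linked walk is
# typically the slower of the two.
print(f"array: {t1 - t0:.4f}s, linked: {t2 - t1:.4f}s")
```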

Implementation

The following pseudocode is language-agnostic. A singly linked list node stores a value and a next pointer; a doubly linked list node adds a prev pointer.

Node Structure

SinglyNode:
    value
    next → SinglyNode or NULL

DoublyNode:
    value
    prev → DoublyNode or NULL
    next → DoublyNode or NULL

Singly Linked List: Core Operations

FUNCTION pushFront(list, value):
    node ← new SinglyNode(value)
    node.next ← list.head
    list.head ← node
    IF list.tail = NULL THEN list.tail ← node
    list.size ← list.size + 1

FUNCTION pushBack(list, value):
    node ← new SinglyNode(value)
    IF list.tail ≠ NULL THEN
        list.tail.next ← node
    ELSE
        list.head ← node
    list.tail ← node
    list.size ← list.size + 1

FUNCTION popFront(list):
    IF list.head = NULL THEN RETURN NULL
    value ← list.head.value
    list.head ← list.head.next
    IF list.head = NULL THEN list.tail ← NULL
    list.size ← list.size - 1
    RETURN value

FUNCTION get(list, index):
    current ← list.head
    FOR i ← 0 TO index - 1:
        IF current = NULL THEN RETURN NULL
        current ← current.next
    IF current = NULL THEN RETURN NULL
    RETURN current.value
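For concreteness, a minimal Python rendering of the singly linked list pseudocode above (class and method names are illustrative):

```python
class SinglyNode:
    def __init__(self, value):
        self.value = value
        self.next = None

class SinglyLinkedList:
    def __init__(self):
        self.head = None
        self.tail = None
        self.size = 0

    def push_front(self, value):
        node = SinglyNode(value)
        node.next = self.head
        self.head = node
        if self.tail is None:          # first node is also the tail
            self.tail = node
        self.size += 1

    def push_back(self, value):
        node = SinglyNode(value)
        if self.tail is not None:
            self.tail.next = node
        else:                          # empty list: node becomes head too
            self.head = node
        self.tail = node
        self.size += 1

    def pop_front(self):
        if self.head is None:
            return None
        value = self.head.value
        self.head = self.head.next
        if self.head is None:          # list emptied: clear the tail too
            self.tail = None
        self.size -= 1
        return value

    def get(self, index):
        current = self.head
        for _ in range(index):
            if current is None:
                return None
            current = current.next
        return current.value if current is not None else None
```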

Doubly Linked List: Delete by Reference

FUNCTION remove(list, node):
    IF node.prev ≠ NULL THEN
        node.prev.next ← node.next
    ELSE
        list.head ← node.next

    IF node.next ≠ NULL THEN
        node.next.prev ← node.prev
    ELSE
        list.tail ← node.prev

    list.size ← list.size - 1
    RETURN node.value

The key insight: in a doubly linked list, remove is $O(1)$ because the node already carries references to its neighbors. In a singly linked list, you'd need to traverse from the head to find the predecessor, which is $O(n)$.
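The same operation in Python, as a sketch (hypothetical class names; push_back returns the node so the caller can hold the reference that makes removal $O(1)$):

```python
class DoublyNode:
    def __init__(self, value):
        self.value = value
        self.prev = None
        self.next = None

class DoublyLinkedList:
    def __init__(self):
        self.head = None
        self.tail = None

    def push_back(self, value):
        node = DoublyNode(value)
        node.prev = self.tail
        if self.tail is not None:
            self.tail.next = node
        else:
            self.head = node
        self.tail = node
        return node                  # keep this reference for O(1) removal

    def remove(self, node):
        # Rewire both neighbors; no traversal, no search.
        if node.prev is not None:
            node.prev.next = node.next
        else:
            self.head = node.next
        if node.next is not None:
            node.next.prev = node.prev
        else:
            self.tail = node.prev
        return node.value
```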

Real-World Applications

  • LRU Cache: The classic LRU cache combines a hash map with a doubly linked list. The map gives $O(1)$ lookup, and the list gives $O(1)$ eviction and promotion. Memcached's LRU works this way; Redis approximates LRU by sampling rather than maintaining an exact list, and many browser caches use the same map-plus-list pattern.
  • Browser history: Back/forward navigation is a doubly linked list of visited pages. Each page points to the previous and next page in the history.
  • OS task management: The Linux kernel chains process descriptors together with intrusive doubly linked lists (the list_head pattern), giving efficient insertion and removal; its old O(1) scheduler kept per-priority run queues as linked lists, though CFS later moved the run queue itself to a red-black tree.
  • Undo/redo systems: Text editors and design tools maintain a doubly linked list of states. Moving backward is "undo," forward is "redo."
  • Memory allocators: Free lists in malloc implementations use linked lists to track available memory blocks. The allocator walks the free list to find a suitable block.
  • Music playlists: Circular doubly linked lists power "repeat all" and shuffle-with-history in media players.
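The LRU pattern from the first bullet can be sketched in about forty lines. This is an illustrative sketch, not any particular library's implementation; sentinel nodes at both ends keep the pointer rewiring free of null checks:

```python
class LRUCache:
    """Hash map + doubly linked list: map gives O(1) lookup,
    list gives O(1) promotion (head) and eviction (tail)."""

    class _Node:
        __slots__ = ("key", "value", "prev", "next")
        def __init__(self, key=None, value=None):
            self.key, self.value = key, value
            self.prev = self.next = None

    def __init__(self, capacity):
        self.capacity = capacity
        self.map = {}                      # key -> node
        self.head = self._Node()           # sentinel on the MRU side
        self.tail = self._Node()           # sentinel on the LRU side
        self.head.next, self.tail.prev = self.tail, self.head

    def _unlink(self, node):
        node.prev.next, node.next.prev = node.next, node.prev

    def _push_front(self, node):
        node.next, node.prev = self.head.next, self.head
        self.head.next.prev = node
        self.head.next = node

    def get(self, key):
        if key not in self.map:
            return None
        node = self.map[key]
        self._unlink(node)                 # promotion: O(1) pointer rewiring
        self._push_front(node)
        return node.value

    def put(self, key, value):
        if key in self.map:
            node = self.map[key]
            node.value = value
            self._unlink(node)
            self._push_front(node)
            return
        if len(self.map) >= self.capacity:
            lru = self.tail.prev           # eviction: O(1) at the tail
            self._unlink(lru)
            del self.map[lru.key]
        node = self._Node(key, value)
        self.map[key] = node
        self._push_front(node)
```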

Key Takeaways

  • Linked lists trade $O(1)$ random access for $O(1)$ insertion and deletion at known positions
  • Singly linked lists use one pointer per node; doubly linked lists use two but enable backward traversal and $O(1)$ deletion of any referenced node
  • Circular variants are useful when you need to cycle through elements (round-robin scheduling, circular buffers)
  • Poor cache locality is the hidden cost: scattered heap allocations mean frequent cache misses during traversal
  • The doubly linked list + hash map combo (LRU cache) is one of the most important patterns in systems programming