Graphs: Representation, BFS, and DFS
How to represent graphs with adjacency lists and matrices, and how breadth-first and depth-first search explore them systematically.
Terminology
- Graph
- A data structure consisting of a set of vertices (nodes) $V$ and a set of edges $E$ connecting pairs of vertices. Written as $G = (V, E)$.
- Vertex (Node)
- A fundamental unit in a graph, representing an entity. Vertices are connected by edges.
- Edge
- A connection between two vertices. In a directed graph, an edge has a direction (from $u$ to $v$). In an undirected graph, the edge connects both ways.
- Directed Graph (Digraph)
- A graph where edges have a direction: edge $(u, v)$ goes from $u$ to $v$ but not necessarily from $v$ to $u$.
- Undirected Graph
- A graph where edges have no direction: edge $\{u, v\}$ connects $u$ and $v$ symmetrically.
- Adjacency List
- A graph representation where each vertex stores a list of its neighbors. Space-efficient for sparse graphs: $O(V + E)$.
- Adjacency Matrix
- A graph representation using a $V \times V$ matrix where entry $(i, j)$ indicates whether an edge exists between vertices $i$ and $j$. Uses $O(V^2)$ space.
- BFS (Breadth-First Search)
- A traversal algorithm that explores all neighbors at the current depth before moving to the next depth level. Uses a queue.
- DFS (Depth-First Search)
- A traversal algorithm that explores as far as possible along each branch before backtracking. Uses a stack (or recursion).
- Connected Component
- A maximal set of vertices such that there is a path between every pair. An undirected graph may have multiple connected components.
What & Why
A graph is the most general-purpose data structure for representing relationships between entities. Unlike trees (which are hierarchical and acyclic), graphs can have cycles, multiple paths between nodes, and no designated root. This flexibility makes them essential for modeling networks, dependencies, maps, social connections, and countless other domains.
Why does this matter? Graphs appear everywhere in computing. Social networks model friendships as edges. GPS navigation finds shortest paths through road graphs. Package managers resolve dependency graphs. Web crawlers traverse the link graph of the internet. Understanding graph representation and traversal is foundational to solving these problems.
Key properties:
- Flexible structure: can represent any pairwise relationship, with or without direction and weights
- Two standard representations: adjacency lists (space-efficient for sparse graphs) and adjacency matrices (fast edge lookup)
- BFS and DFS: the two fundamental traversal strategies, each with distinct properties and use cases
- Foundation for advanced algorithms: shortest path, minimum spanning tree, topological sort, and network flow all build on graph basics
How It Works
Graph Representations
Consider this undirected graph with 5 vertices and 6 edges:
Adjacency List representation:
0: [1, 2]
1: [0, 2, 3]
2: [0, 1, 4]
3: [1, 4]
4: [2, 3]
Each vertex maps to a list of its neighbors. Total storage: $O(V + E)$. Checking if edge $(u, v)$ exists requires scanning $u$'s neighbor list: $O(\text{degree}(u))$.
Adjacency Matrix representation:
0 1 2 3 4
0 [ 0, 1, 1, 0, 0 ]
1 [ 1, 0, 1, 1, 0 ]
2 [ 1, 1, 0, 0, 1 ]
3 [ 0, 1, 0, 0, 1 ]
4 [ 0, 0, 1, 1, 0 ]
A 1 at position $(i, j)$ means an edge exists. Edge lookup is $O(1)$, but storage is $O(V^2)$ regardless of edge count.
When to Use Which
| Criteria | Adjacency List | Adjacency Matrix |
|---|---|---|
| Space | $O(V + E)$ | $O(V^2)$ |
| Edge lookup | $O(\text{deg}(u))$ | $O(1)$ |
| Iterate neighbors | $O(\text{deg}(u))$ | $O(V)$ |
| Best for | Sparse graphs ($E \ll V^2$) | Dense graphs ($E \approx V^2$) |
BFS (Breadth-First Search)
BFS explores the graph level by level, visiting all vertices at distance $d$ from the source before any vertex at distance $d + 1$. It uses a queue to maintain the frontier.
Starting BFS from vertex 0 on the graph above visits: 0, 1, 2, 3, 4.
BFS naturally finds the shortest path (fewest edges) from the source to every reachable vertex in an unweighted graph.
DFS (Depth-First Search)
DFS explores as deep as possible along each branch before backtracking. It uses a stack (explicit or via recursion) to track the current path.
Starting DFS from vertex 0 might visit: 0, 1, 2, 4, 3 (the exact order depends on neighbor iteration order).
DFS is useful for detecting cycles, topological sorting, finding connected components, and solving maze-like problems.
Complexity Analysis
| Algorithm | Time | Space | Notes |
|---|---|---|---|
| BFS | $O(V + E)$ | $O(V)$ | Queue + visited set |
| DFS | $O(V + E)$ | $O(V)$ | Stack + visited set |
| BFS shortest path | $O(V + E)$ | $O(V)$ | Unweighted graphs only |
| Connected components | $O(V + E)$ | $O(V)$ | Run BFS/DFS from each unvisited vertex |
Both BFS and DFS visit every vertex and examine every edge exactly once (in an adjacency list representation), giving $O(V + E)$ time. The space is $O(V)$ for the visited set and the queue/stack.
Implementation
Graph Construction (Pseudocode)
FUNCTION buildGraph(edges, directed):
graph ← empty adjacency list (map of vertex → list of neighbors)
FOR EACH (u, v) IN edges:
ADD v TO graph[u]
IF NOT directed THEN ADD u TO graph[v]
RETURN graph
BFS (Pseudocode)
FUNCTION bfs(graph, start):
visited ← set containing {start}
queue ← empty queue
order ← empty list
ENQUEUE start INTO queue
WHILE queue IS NOT EMPTY:
node ← DEQUEUE from queue
APPEND node TO order
FOR EACH neighbor IN graph[node]:
IF neighbor NOT IN visited THEN
ADD neighbor TO visited
ENQUEUE neighbor INTO queue
RETURN order
DFS (Pseudocode)
FUNCTION dfs(graph, start):
visited ← empty set
order ← empty list
explore(graph, start, visited, order)
RETURN order
FUNCTION explore(graph, node, visited, order):
ADD node TO visited
APPEND node TO order
FOR EACH neighbor IN graph[node]:
IF neighbor NOT IN visited THEN
explore(graph, neighbor, visited, order)
BFS Shortest Path (Pseudocode)
FUNCTION bfsShortestPath(graph, start, end):
visited ← set containing {start}
queue ← empty queue
parent ← empty map
ENQUEUE start INTO queue
WHILE queue IS NOT EMPTY:
node ← DEQUEUE from queue
IF node = end THEN
path ← empty list
current ← end
WHILE current IS DEFINED:
PREPEND current TO path
current ← parent[current]
RETURN path
FOR EACH neighbor IN graph[node]:
IF neighbor NOT IN visited THEN
ADD neighbor TO visited
parent[neighbor] ← node
ENQUEUE neighbor INTO queue
RETURN NULL // no path found
Real-World Applications
- Social networks: users are vertices, friendships are edges; BFS finds degrees of separation between people
- GPS navigation: road networks are weighted graphs; shortest-path algorithms (Dijkstra, A*) find optimal routes
- Web crawling: the internet is a directed graph of pages linked by hyperlinks; crawlers use BFS to discover pages level by level
- Package managers: npm, pip, and cargo model dependencies as directed acyclic graphs; topological sort determines installation order
- Network routing: routers use graph algorithms (OSPF uses Dijkstra, BGP uses path-vector) to find optimal packet routes
- Garbage collection: mark-and-sweep GC uses DFS/BFS from root references to find reachable objects in the object graph
Key Takeaways
- Graphs are the most general relationship data structure: vertices connected by edges, with optional direction and weights
- Adjacency lists use $O(V + E)$ space and are preferred for sparse graphs; adjacency matrices use $O(V^2)$ but offer $O(1)$ edge lookup
- BFS explores level by level using a queue and finds shortest paths in unweighted graphs
- DFS explores depth-first using a stack (or recursion) and is the basis for cycle detection, topological sort, and connected components
- Both BFS and DFS run in $O(V + E)$ time, visiting every vertex and edge exactly once