BFS and DFS: The Two Ways to Explore a Graph
A deep dive into breadth-first search and depth-first search, their properties, trade-offs, and the problems each one solves best.
Terminology
- BFS (Breadth-First Search)
- A graph traversal that explores all vertices at distance $d$ from the source before any vertex at distance $d+1$. Uses a queue. Guarantees shortest path in unweighted graphs.
- DFS (Depth-First Search)
- A graph traversal that explores as far as possible along each branch before backtracking. Uses a stack (or recursion). Useful for cycle detection, topological sort, and connected components.
- Frontier
- The set of vertices discovered but not yet fully explored. In BFS this is the queue; in DFS this is the stack (or the current recursion path).
- Visited Set
- A set tracking which vertices have been discovered, preventing re-exploration and infinite loops in cyclic graphs.
- Shortest Path (Unweighted)
- The path with the fewest edges between two vertices. BFS finds this naturally because it explores level by level.
- Topological Sort
- A linear ordering of vertices in a directed acyclic graph (DAG) such that for every edge $(u, v)$, $u$ appears before $v$. DFS produces this by recording vertices in reverse finish order.
- Connected Component
- A maximal set of vertices where every pair is connected by a path. Both BFS and DFS can identify all connected components by running from each unvisited vertex.
- Back Edge
- An edge from a vertex to one of its ancestors in the DFS tree. The presence of a back edge indicates a cycle in the graph.
- Tree Edge
- An edge in the DFS or BFS tree that leads to a newly discovered vertex.
- Level (BFS)
- The set of all vertices at the same distance from the source. BFS processes vertices level by level.
What & Why
BFS and DFS are the two fundamental strategies for systematically visiting every vertex in a graph. They differ in one key choice: which vertex to explore next from the frontier.
BFS picks the vertex that was discovered earliest (FIFO, using a queue), so it fans out level by level. DFS picks the vertex that was discovered most recently (LIFO, using a stack), so it dives deep before backtracking.
Why does this matter? This single difference in exploration order gives each algorithm distinct properties that make them suited to different problems:
- BFS finds shortest paths in unweighted graphs, computes distances, and detects the minimum number of steps to reach a target
- DFS detects cycles, computes topological orderings, finds connected/strongly connected components, and solves backtracking problems
Every graph algorithm you'll encounter in this curriculum builds on one of these two traversals. Dijkstra's algorithm is a weighted generalization of BFS. Tarjan's SCC algorithm is built on DFS. Understanding both deeply is non-negotiable.
Key properties:
- Same time complexity: both run in
O(V + E)on adjacency lists - Different space profiles: BFS stores the entire frontier (can be wide); DFS stores the current path (can be deep)
- BFS = shortest path in unweighted graphs; DFS does not guarantee shortest paths
- DFS = cycle detection via back edges; BFS can detect cycles but less naturally
How It Works
BFS: Level-by-Level Exploration
BFS starts at a source vertex, visits all its neighbors (level 1), then all neighbors of those neighbors (level 2), and so on. It uses a queue to maintain the frontier.
At each step, BFS dequeues the front vertex, processes it, and enqueues all its unvisited neighbors. The queue ensures FIFO order, which is why vertices are processed level by level.
DFS: Dive Deep, Then Backtrack
DFS starts at a source vertex and immediately dives into the first neighbor, then that neighbor's first neighbor, and so on until it hits a dead end. Then it backtracks to the most recent vertex with unexplored neighbors and continues.
DFS can be implemented with an explicit stack or with recursion (which uses the call stack implicitly). The recursive version is more common because it's cleaner, but the iterative version avoids stack overflow on deep graphs.
BFS vs DFS: Side-by-Side Comparison
| Property | BFS | DFS |
|---|---|---|
| Data structure | Queue (FIFO) | Stack (LIFO) / recursion |
| Exploration order | Level by level | Branch by branch |
| Shortest path? | Yes (unweighted) | No |
| Space (worst case) | $O(V)$ (wide frontier) | $O(V)$ (deep path) |
| Cycle detection | Possible but awkward | Natural (back edges) |
| Topological sort | Kahn's algorithm (BFS-based) | Reverse postorder |
When to Use Which
- Use BFS when you need shortest paths, minimum steps, or level-order processing
- Use DFS when you need cycle detection, topological ordering, path existence, or backtracking
- For connected components, either works; DFS is slightly simpler to implement recursively
DFS Edge Classification
During DFS on a directed graph, every edge falls into one of four categories:
- Tree edge: leads to an unvisited vertex (part of the DFS tree)
- Back edge: leads to an ancestor in the DFS tree (indicates a cycle)
- Forward edge: leads to a descendant already visited via another path
- Cross edge: leads to a vertex in a different branch of the DFS tree
In undirected graphs, only tree edges and back edges exist. A back edge in an undirected graph means a cycle.
Complexity Analysis
| Algorithm | Time | Space | Notes |
|---|---|---|---|
| BFS | $O(V + E)$ | $O(V)$ | Queue can hold up to $O(V)$ vertices at widest level |
| DFS (recursive) | $O(V + E)$ | $O(V)$ | Call stack depth up to $O(V)$ for linear graphs |
| DFS (iterative) | $O(V + E)$ | $O(V)$ | Explicit stack avoids call stack overflow |
| BFS shortest path | $O(V + E)$ | $O(V)$ | Parent map adds $O(V)$ space |
| Find all components | $O(V + E)$ | $O(V)$ | Run BFS/DFS from each unvisited vertex |
Space Trade-off in Practice
BFS space depends on the maximum width of the graph (the largest level). In a balanced binary tree with $n$ nodes, the widest level has $\approx n/2$ nodes, so BFS uses $O(n)$ space. DFS on the same tree uses $O(\log n)$ space (the height).
Conversely, on a long chain graph (path graph), DFS uses $O(n)$ stack space while BFS uses $O(1)$ queue space (each level has exactly one node).
Implementation
BFS (Pseudocode)
FUNCTION bfs(graph, start):
visited ← set containing {start}
queue ← empty queue
ENQUEUE start INTO queue
WHILE queue IS NOT EMPTY:
node ← DEQUEUE from queue
PROCESS node
FOR EACH neighbor IN graph[node]:
IF neighbor NOT IN visited THEN
ADD neighbor TO visited
ENQUEUE neighbor INTO queue
BFS Shortest Path (Pseudocode)
FUNCTION bfsShortestPath(graph, start, target):
visited ← set containing {start}
queue ← empty queue
parent ← empty map
ENQUEUE start INTO queue
WHILE queue IS NOT EMPTY:
node ← DEQUEUE from queue
IF node = target THEN
RETURN reconstructPath(parent, start, target)
FOR EACH neighbor IN graph[node]:
IF neighbor NOT IN visited THEN
ADD neighbor TO visited
parent[neighbor] ← node
ENQUEUE neighbor INTO queue
RETURN NULL // no path exists
FUNCTION reconstructPath(parent, start, target):
path ← empty list
current ← target
WHILE current ≠ start:
PREPEND current TO path
current ← parent[current]
PREPEND start TO path
RETURN path
DFS Recursive (Pseudocode)
FUNCTION dfs(graph, start):
visited ← empty set
dfsExplore(graph, start, visited)
FUNCTION dfsExplore(graph, node, visited):
ADD node TO visited
PROCESS node
FOR EACH neighbor IN graph[node]:
IF neighbor NOT IN visited THEN
dfsExplore(graph, neighbor, visited)
DFS Iterative (Pseudocode)
FUNCTION dfsIterative(graph, start):
visited ← empty set
stack ← empty stack
PUSH start ONTO stack
WHILE stack IS NOT EMPTY:
node ← POP from stack
IF node IN visited THEN CONTINUE
ADD node TO visited
PROCESS node
FOR EACH neighbor IN graph[node]:
IF neighbor NOT IN visited THEN
PUSH neighbor ONTO stack
Cycle Detection with DFS (Pseudocode)
FUNCTION hasCycle(graph):
visited ← empty set
inStack ← empty set // tracks current DFS path
FOR EACH vertex IN graph:
IF vertex NOT IN visited THEN
IF dfsCycle(graph, vertex, visited, inStack) THEN
RETURN true
RETURN false
FUNCTION dfsCycle(graph, node, visited, inStack):
ADD node TO visited
ADD node TO inStack
FOR EACH neighbor IN graph[node]:
IF neighbor NOT IN visited THEN
IF dfsCycle(graph, neighbor, visited, inStack) THEN
RETURN true
ELSE IF neighbor IN inStack THEN
RETURN true // back edge found = cycle
REMOVE node FROM inStack
RETURN false
Connected Components (Pseudocode)
FUNCTION findComponents(graph):
visited ← empty set
components ← empty list
FOR EACH vertex IN graph:
IF vertex NOT IN visited THEN
component ← empty list
bfsCollect(graph, vertex, visited, component)
APPEND component TO components
RETURN components
FUNCTION bfsCollect(graph, start, visited, component):
ADD start TO visited
queue ← empty queue
ENQUEUE start INTO queue
WHILE queue IS NOT EMPTY:
node ← DEQUEUE from queue
APPEND node TO component
FOR EACH neighbor IN graph[node]:
IF neighbor NOT IN visited THEN
ADD neighbor TO visited
ENQUEUE neighbor INTO queue
Real-World Applications
- Shortest path in unweighted graphs: BFS finds the minimum number of hops between two nodes, used in social network "degrees of separation," word ladder puzzles, and network hop counting
- Web crawling: search engines use BFS to discover pages level by level from seed URLs, ensuring broad coverage before deep crawling
- Cycle detection in dependencies: package managers (npm, pip) use DFS to detect circular dependencies before installation
- Topological sorting: build systems (Make, Gradle) use DFS-based topological sort to determine compilation order
- Maze solving: BFS finds the shortest path through a maze; DFS finds any path (useful for maze generation)
- Garbage collection: mark-and-sweep GC uses DFS/BFS from root references to identify reachable objects
- Network broadcast: BFS models how information spreads through a network, level by level from the source
- Strongly connected components: Tarjan's and Kosaraju's algorithms use DFS to decompose directed graphs into SCCs
Key Takeaways
- BFS and DFS both visit every vertex and edge in $O(V + E)$ time, but their exploration order differs fundamentally
- BFS uses a queue and explores level by level, making it the right choice for shortest paths in unweighted graphs
- DFS uses a stack (or recursion) and dives deep before backtracking, making it natural for cycle detection, topological sort, and backtracking problems
- BFS space is proportional to the widest level; DFS space is proportional to the maximum depth
- DFS classifies edges into tree, back, forward, and cross edges; back edges indicate cycles