Interval Merging and Sweep Line
Master interval problems including merging overlapping intervals, meeting rooms scheduling, insert interval, and the event-based sweep line technique.
Terminology
| Term | Definition |
|---|---|
| Interval | A pair $[start, end]$ representing a continuous range. Two intervals overlap if one starts before the other ends. |
| Overlapping intervals | Intervals $[a, b]$ and $[c, d]$ overlap when $a \leq d$ and $c \leq b$. They can be merged into $[\min(a,c), \max(b,d)]$. |
| Sweep line | An algorithmic technique that processes events (interval starts and ends) in sorted order, maintaining a running state as the "line" sweeps across the timeline. |
| Event point | A timestamp where something changes: an interval starts (+1 active) or ends (-1 active). Sweep line processes these in order. |
| Active count | The number of intervals that are "open" at a given point in time. The maximum active count gives the peak overlap (e.g., minimum meeting rooms needed). |
| Merge intervals | Combining all overlapping intervals into non-overlapping intervals. Sort by start time, then greedily extend or start new intervals. |
| Insert interval | Adding a new interval to a sorted, non-overlapping list and merging any resulting overlaps. A common variation of the merge pattern. |
What & Why
Interval problems show up constantly in scheduling, calendar applications, and resource allocation. The core challenge is handling overlaps efficiently.
The two main techniques are:
- Sort and merge: sort intervals by start time, then walk through them, merging overlapping intervals greedily. This handles "merge overlapping intervals" and "insert interval" in
O(n \log n). - Sweep line: convert each interval into two events (start and end), sort all events, then sweep through them maintaining a counter. This handles "minimum meeting rooms" and "maximum overlap" problems.
Both techniques rely on sorting as the first step. Once sorted, a single linear pass resolves the problem. The pattern is recognizable: whenever you see a list of intervals and need to reason about overlaps, conflicts, or coverage, sort-and-scan or sweep line will likely work.
How It Works
Merge Overlapping Intervals
Sort by start time. Walk through the sorted list. If the current interval overlaps with the last merged interval, extend the end. Otherwise, start a new merged interval.
Sweep Line: Meeting Rooms (Minimum Rooms Needed)
Create +1 events for each meeting start and -1 events for each meeting end. Sort events by time (break ties: ends before starts). Sweep through, tracking the running count. The peak is the answer.
Complexity Analysis
| Problem | Time | Space |
|---|---|---|
| Merge intervals | $O(n \log n)$ | $O(n)$ |
| Insert interval | $O(n)$ | $O(n)$ |
| Meeting rooms (min rooms) | $O(n \log n)$ | $O(n)$ |
| Maximum overlap count | $O(n \log n)$ | $O(n)$ |
Sorting dominates at $O(n \log n)$. The subsequent linear scan is $O(n)$. Total:
Implementation
Merge Overlapping Intervals
ALGORITHM MergeIntervals(intervals, n)
INPUT: array of n intervals [start, end]
OUTPUT: merged non-overlapping intervals
SORT intervals by start time ascending
merged = [intervals[0]]
FOR i = 1 TO n - 1 DO
last = merged[LENGTH(merged) - 1]
IF intervals[i].start <= last.end THEN
last.end = MAX(last.end, intervals[i].end)
ELSE
merged.append(intervals[i])
END IF
END FOR
RETURN merged
END
Sweep Line: Minimum Meeting Rooms
ALGORITHM MinMeetingRooms(meetings, n)
INPUT: array of n meetings [start, end]
OUTPUT: minimum number of rooms needed
events = empty list
FOR EACH meeting IN meetings DO
events.append((meeting.start, +1))
events.append((meeting.end, -1))
END FOR
SORT events by time ascending
// tie-break: -1 (end) before +1 (start)
active = 0
maxActive = 0
FOR EACH (time, delta) IN events DO
active = active + delta
maxActive = MAX(maxActive, active)
END FOR
RETURN maxActive
END
Insert Interval
ALGORITHM InsertInterval(intervals, newInterval)
INPUT: sorted non-overlapping intervals, newInterval [start, end]
OUTPUT: merged intervals after insertion
result = empty list
i = 0
n = LENGTH(intervals)
// Add all intervals that end before newInterval starts
WHILE i < n AND intervals[i].end < newInterval.start DO
result.append(intervals[i])
i = i + 1
END WHILE
// Merge overlapping intervals
WHILE i < n AND intervals[i].start <= newInterval.end DO
newInterval.start = MIN(newInterval.start, intervals[i].start)
newInterval.end = MAX(newInterval.end, intervals[i].end)
i = i + 1
END WHILE
result.append(newInterval)
// Add remaining intervals
WHILE i < n DO
result.append(intervals[i])
i = i + 1
END WHILE
RETURN result
END
Real-World Applications
- Calendar scheduling: Google Calendar and Outlook detect conflicts and find free slots by merging and scanning intervals.
- Resource allocation: cloud providers determine peak VM usage by sweep-lining reservation intervals to plan capacity.
- Network bandwidth planning: overlapping data transfer windows are merged to compute peak bandwidth requirements.
- Genomics: merging overlapping gene regions and computing coverage depth across a chromosome uses interval merge and sweep line.
- Computational geometry: sweep line algorithms solve line segment intersection, closest pair, and Voronoi diagram problems.
Key Takeaways
- Sort intervals by start time first. This single step enables a greedy linear scan for most interval problems
- Merge intervals: extend the current interval if overlap exists, otherwise start a new one. The overlap condition is $\text{current.start} \leq \text{last.end}$
- Sweep line converts intervals into +1/-1 events, sorts them, and tracks a running count. The peak count answers "maximum overlap" questions
- Insert interval is a three-phase scan: add non-overlapping intervals before, merge overlapping ones, add non-overlapping intervals after
- The $O(n \log n)$ sort dominates the cost. The scan itself is always $O(n)$