Want Help Cracking FAANG?

(Then click this)

×
Back to Question Bank

352. Data Stream as Disjoint Intervals - Leetcode Solution

Problem Description

You are given a data stream of non-negative integers arriving one at a time. You need to design a data structure that supports two operations:

  • addNum(val): Add a new integer val from the stream to the data structure.
  • getIntervals(): Return the current set of disjoint intervals covering all the numbers seen so far. Each interval should be represented as a pair [start, end] and the intervals should be sorted by their start values.

Key constraints:

  • Each number in the stream is in the range [0, 10^4].
  • Intervals must be disjoint and sorted.
  • Duplicate numbers in the stream should not affect the intervals.
  • Efficiently merge adjacent or overlapping intervals when new numbers are added.

Thought Process

At first glance, the problem seems to require simply storing all numbers and reporting their minimum and maximum as intervals. However, since numbers can arrive in any order and there may be gaps, we need to track potentially many intervals.

A brute-force approach would be to store all numbers, sort them, and scan through to build intervals on each getIntervals() call. But this is inefficient, especially if getIntervals() is called frequently.

Instead, we want to maintain the intervals as we add numbers, so that getIntervals() can be answered quickly. This means that each addNum() must efficiently update the current set of intervals, merging or extending them if necessary.

Solution Approach

To solve this efficiently, we maintain a sorted list of intervals. Each interval is a pair [start, end] representing a contiguous range of numbers seen so far. When a new number is added:

  1. Find the correct position: Use binary search (or equivalent) to find where the new number fits relative to existing intervals.
  2. Check for merging:
    • If the number is already covered by an interval, do nothing.
    • If it is adjacent to one or two intervals (e.g., right after the end of one, or right before the start of another), merge the intervals or extend as needed.
    • If it is isolated, insert a new interval [val, val].
  3. Efficient Updates: By keeping the intervals sorted, insertion and merging can be done in logarithmic time with respect to the number of intervals.

For implementation, we can use:

  • A list or array for intervals, using binary search to find positions (Python's bisect module is handy).
  • In C++ and Java, TreeMap or similar structures can provide efficient ordered access.
  • In JavaScript, arrays with manual binary search are used.

Example Walkthrough

Let's walk through an example with the sequence: addNum(1), addNum(3), addNum(7), addNum(2), addNum(6).

  1. After addNum(1): intervals = [[1,1]]
  2. After addNum(3): intervals = [[1,1], [3,3]]
  3. After addNum(7): intervals = [[1,1], [3,3], [7,7]]
  4. After addNum(2):
    • 2 is adjacent to [1,1] and [3,3], so merge all three into [1,3]
    • intervals = [[1,3], [7,7]]
  5. After addNum(6):
    • 6 is adjacent to [7,7], so merge into [6,7]
    • intervals = [[1,3], [6,7]]

Calling getIntervals() now returns [[1,3], [6,7]].

Code Implementation

from bisect import bisect_left

class SummaryRanges:
    def __init__(self):
        self.intervals = []

    def addNum(self, val: int) -> None:
        intervals = self.intervals
        left = 0
        right = len(intervals)
        while left < right:
            mid = (left + right) // 2
            if intervals[mid][0] < val:
                left = mid + 1
            else:
                right = mid
        i = left
        # Merge with left
        if i > 0 and intervals[i-1][1] + 1 >= val:
            i -= 1
        else:
            intervals.insert(i, [val, val])
        # Merge intervals
        while i+1 < len(intervals) and intervals[i][1] + 1 >= intervals[i+1][0]:
            intervals[i][1] = max(intervals[i][1], intervals[i+1][1])
            del intervals[i+1]

    def getIntervals(self) -> list:
        return self.intervals
      
class SummaryRanges {
public:
    vector<vector<int>> intervals;
    SummaryRanges() {
    }
    void addNum(int val) {
        int n = intervals.size();
        int i = 0;
        while (i < n && intervals[i][1] + 1 < val) i++;
        if (i == n || intervals[i][0] - 1 > val) {
            intervals.insert(intervals.begin() + i, {val, val});
        } else {
            intervals[i][0] = min(intervals[i][0], val);
            intervals[i][1] = max(intervals[i][1], val);
        }
        // Merge with next intervals if necessary
        while (i + 1 < intervals.size() && intervals[i][1] + 1 >= intervals[i+1][0]) {
            intervals[i][1] = max(intervals[i][1], intervals[i+1][1]);
            intervals.erase(intervals.begin() + i + 1);
        }
    }
    vector<vector<int>> getIntervals() {
        return intervals;
    }
};
      
class SummaryRanges {
    private List<int[]> intervals;

    public SummaryRanges() {
        intervals = new ArrayList<>();
    }

    public void addNum(int val) {
        int n = intervals.size();
        int i = 0;
        while (i < n && intervals.get(i)[1] + 1 < val) i++;
        if (i == n || intervals.get(i)[0] - 1 > val) {
            intervals.add(i, new int[]{val, val});
        } else {
            intervals.get(i)[0] = Math.min(intervals.get(i)[0], val);
            intervals.get(i)[1] = Math.max(intervals.get(i)[1], val);
        }
        // Merge with next intervals if necessary
        while (i + 1 < intervals.size() && intervals.get(i)[1] + 1 >= intervals.get(i+1)[0]) {
            intervals.get(i)[1] = Math.max(intervals.get(i)[1], intervals.get(i+1)[1]);
            intervals.remove(i+1);
        }
    }

    public int[][] getIntervals() {
        return intervals.toArray(new int[intervals.size()][]);
    }
}
      
class SummaryRanges {
    constructor() {
        this.intervals = [];
    }

    addNum(val) {
        let intervals = this.intervals;
        let i = 0;
        while (i < intervals.length && intervals[i][1] + 1 < val) i++;
        if (i === intervals.length || intervals[i][0] - 1 > val) {
            intervals.splice(i, 0, [val, val]);
        } else {
            intervals[i][0] = Math.min(intervals[i][0], val);
            intervals[i][1] = Math.max(intervals[i][1], val);
        }
        // Merge with next intervals if necessary
        while (i + 1 < intervals.length && intervals[i][1] + 1 >= intervals[i+1][0]) {
            intervals[i][1] = Math.max(intervals[i][1], intervals[i+1][1]);
            intervals.splice(i+1, 1);
        }
    }

    getIntervals() {
        return this.intervals;
    }
}
      

Time and Space Complexity

Brute-force approach:

  • Time: O(N \log N) per getIntervals() call (for sorting and scanning all numbers), where N is the number of elements seen so far.
  • Space: O(N) for storing all numbers.
Optimized approach (as above):
  • Time: Each addNum() is O(K) in the worst case (where K is the number of intervals), but usually much faster due to merging and early breaks. With binary search, this can be reduced to O(\log K) for finding the insert position, plus a small number of merges.
  • Space: O(K), where K is the number of intervals (which is at most N, but usually much smaller).

The optimized approach is efficient for frequent updates and queries, and leverages the fact that the number of intervals is usually much less than the number of numbers added.

Summary

The "Data Stream as Disjoint Intervals" problem is solved efficiently by maintaining a sorted list of intervals and merging them as new numbers are added. This approach ensures that both adding numbers and retrieving the current intervals are fast and use minimal memory. By carefully merging adjacent or overlapping intervals during insertion, we avoid expensive recomputation and provide a clean, elegant solution.