You are given a data stream of non-negative integers arriving one at a time. You need to design a data structure that supports two operations:
addNum(val)
: Add a new integer val
from the stream to the data structure.getIntervals()
: Return the current set of disjoint intervals covering all the numbers seen so far. Each interval should be represented as a pair [start, end]
and the intervals should be sorted by their start values.Key constraints:
[0, 10^4]
.At first glance, the problem seems to require simply storing all numbers and reporting their minimum and maximum as intervals. However, since numbers can arrive in any order and there may be gaps, we need to track potentially many intervals.
A brute-force approach would be to store all numbers, sort them, and scan through to build intervals on each getIntervals()
call. But this is inefficient, especially if getIntervals()
is called frequently.
Instead, we want to maintain the intervals as we add numbers, so that getIntervals()
can be answered quickly. This means that each addNum()
must efficiently update the current set of intervals, merging or extending them if necessary.
To solve this efficiently, we maintain a sorted list of intervals. Each interval is a pair [start, end]
representing a contiguous range of numbers seen so far. When a new number is added:
[val, val]
.For implementation, we can use:
bisect
module is handy).
Let's walk through an example with the sequence: addNum(1)
, addNum(3)
, addNum(7)
, addNum(2)
, addNum(6)
.
addNum(1)
: intervals = [[1,1]]addNum(3)
: intervals = [[1,1], [3,3]]addNum(7)
: intervals = [[1,1], [3,3], [7,7]]addNum(2)
:
addNum(6)
:
Calling getIntervals()
now returns [[1,3], [6,7]]
.
from bisect import bisect_left
class SummaryRanges:
def __init__(self):
self.intervals = []
def addNum(self, val: int) -> None:
intervals = self.intervals
left = 0
right = len(intervals)
while left < right:
mid = (left + right) // 2
if intervals[mid][0] < val:
left = mid + 1
else:
right = mid
i = left
# Merge with left
if i > 0 and intervals[i-1][1] + 1 >= val:
i -= 1
else:
intervals.insert(i, [val, val])
# Merge intervals
while i+1 < len(intervals) and intervals[i][1] + 1 >= intervals[i+1][0]:
intervals[i][1] = max(intervals[i][1], intervals[i+1][1])
del intervals[i+1]
def getIntervals(self) -> list:
return self.intervals
class SummaryRanges {
public:
vector<vector<int>> intervals;
SummaryRanges() {
}
void addNum(int val) {
int n = intervals.size();
int i = 0;
while (i < n && intervals[i][1] + 1 < val) i++;
if (i == n || intervals[i][0] - 1 > val) {
intervals.insert(intervals.begin() + i, {val, val});
} else {
intervals[i][0] = min(intervals[i][0], val);
intervals[i][1] = max(intervals[i][1], val);
}
// Merge with next intervals if necessary
while (i + 1 < intervals.size() && intervals[i][1] + 1 >= intervals[i+1][0]) {
intervals[i][1] = max(intervals[i][1], intervals[i+1][1]);
intervals.erase(intervals.begin() + i + 1);
}
}
vector<vector<int>> getIntervals() {
return intervals;
}
};
class SummaryRanges {
private List<int[]> intervals;
public SummaryRanges() {
intervals = new ArrayList<>();
}
public void addNum(int val) {
int n = intervals.size();
int i = 0;
while (i < n && intervals.get(i)[1] + 1 < val) i++;
if (i == n || intervals.get(i)[0] - 1 > val) {
intervals.add(i, new int[]{val, val});
} else {
intervals.get(i)[0] = Math.min(intervals.get(i)[0], val);
intervals.get(i)[1] = Math.max(intervals.get(i)[1], val);
}
// Merge with next intervals if necessary
while (i + 1 < intervals.size() && intervals.get(i)[1] + 1 >= intervals.get(i+1)[0]) {
intervals.get(i)[1] = Math.max(intervals.get(i)[1], intervals.get(i+1)[1]);
intervals.remove(i+1);
}
}
public int[][] getIntervals() {
return intervals.toArray(new int[intervals.size()][]);
}
}
class SummaryRanges {
constructor() {
this.intervals = [];
}
addNum(val) {
let intervals = this.intervals;
let i = 0;
while (i < intervals.length && intervals[i][1] + 1 < val) i++;
if (i === intervals.length || intervals[i][0] - 1 > val) {
intervals.splice(i, 0, [val, val]);
} else {
intervals[i][0] = Math.min(intervals[i][0], val);
intervals[i][1] = Math.max(intervals[i][1], val);
}
// Merge with next intervals if necessary
while (i + 1 < intervals.length && intervals[i][1] + 1 >= intervals[i+1][0]) {
intervals[i][1] = Math.max(intervals[i][1], intervals[i+1][1]);
intervals.splice(i+1, 1);
}
}
getIntervals() {
return this.intervals;
}
}
Brute-force approach:
O(N \log N)
per getIntervals()
call (for sorting and scanning all numbers), where N
is the number of elements seen so far.O(N)
for storing all numbers.addNum()
is O(K)
in the worst case (where K
is the number of intervals), but usually much faster due to merging and early breaks. With binary search, this can be reduced to O(\log K)
for finding the insert position, plus a small number of merges.O(K)
, where K
is the number of intervals (which is at most N
, but usually much smaller).The optimized approach is efficient for frequent updates and queries, and leverages the fact that the number of intervals is usually much less than the number of numbers added.
The "Data Stream as Disjoint Intervals" problem is solved efficiently by maintaining a sorted list of intervals and merging them as new numbers are added. This approach ensures that both adding numbers and retrieving the current intervals are fast and use minimal memory. By carefully merging adjacent or overlapping intervals during insertion, we avoid expensive recomputation and provide a clean, elegant solution.