The Kth Largest Element in a Stream problem asks you to design a class that efficiently tracks the k
th largest element in a dynamically growing sequence of integers. You are given two parameters: an integer k
, and an initial integer array nums
. The class should support an add(val)
method, which adds a new integer val
to the stream and returns the k
th largest element among all elements seen so far (including the new one).
k
and nums
.add
method can be called multiple times; each call must return the current k
th largest element.nums
+ number of add
callsk
elements after each add
operation
At first glance, you might think to keep all numbers in a list and sort it after every add
operation, then pick the k
th largest value. However, sorting the entire list every time is inefficient, especially as the stream grows. We need a data structure that helps us efficiently keep track of the k
largest elements without sorting everything each time.
The key insight is that we only care about the k
largest elements at any time. If we could always quickly access the smallest of those k
largest, we’d know the k
th largest overall. A min-heap is perfect for this: it allows us to efficiently maintain the k
largest elements, with the smallest of them at the top.
We'll use a min-heap (priority queue) of size k
to keep track of the k
largest elements seen so far. Here’s how:
nums
into the min-heap.k
, remove the smallest element (pop from heap) until only k
elements remain.add(val)
is called, insert val
into the heap.k
, remove the smallest element.k
th largest element in the stream.
Why a min-heap? Because it lets us efficiently discard elements smaller than the current k
largest, and always gives us the k
th largest at the top.
Suppose k = 3
and nums = [4, 5, 8, 2]
. We want to track the 3rd largest element at each step.
At each step, the top of the min-heap gives us the k
th largest element.
add
: O(N log N) per operation, where N is the number of elements so far.add
operation: O(log k) (heap insert and possibly pop)
The optimized approach is much faster and uses less memory, especially when k
is much smaller than the total number of elements.
import heapq
class KthLargest:
def __init__(self, k: int, nums: list[int]):
self.k = k
self.heap = nums
heapq.heapify(self.heap)
while len(self.heap) > k:
heapq.heappop(self.heap)
def add(self, val: int) -> int:
heapq.heappush(self.heap, val)
if len(self.heap) > self.k:
heapq.heappop(self.heap)
return self.heap[0]
#include <queue>
#include <vector>
class KthLargest {
int k;
std::priority_queue<int, std::vector<int>, std::greater<int>> minHeap;
public:
KthLargest(int k, std::vector<int>& nums) : k(k) {
for (int n : nums) {
minHeap.push(n);
if (minHeap.size() > k) minHeap.pop();
}
}
int add(int val) {
minHeap.push(val);
if (minHeap.size() > k) minHeap.pop();
return minHeap.top();
}
};
import java.util.PriorityQueue;
class KthLargest {
private PriorityQueue<Integer> minHeap;
private int k;
public KthLargest(int k, int[] nums) {
this.k = k;
minHeap = new PriorityQueue<>();
for (int n : nums) {
minHeap.offer(n);
if (minHeap.size() > k) minHeap.poll();
}
}
public int add(int val) {
minHeap.offer(val);
if (minHeap.size() > k) minHeap.poll();
return minHeap.peek();
}
}
class MinHeap {
constructor() {
this.heap = [];
}
push(val) {
this.heap.push(val);
this._siftUp();
}
pop() {
if (this.heap.length === 1) return this.heap.pop();
const top = this.heap[0];
this.heap[0] = this.heap.pop();
this._siftDown();
return top;
}
peek() {
return this.heap[0];
}
size() {
return this.heap.length;
}
_siftUp() {
let idx = this.heap.length - 1;
while (idx > 0) {
let parent = Math.floor((idx - 1) / 2);
if (this.heap[parent] <= this.heap[idx]) break;
[this.heap[parent], this.heap[idx]] = [this.heap[idx], this.heap[parent]];
idx = parent;
}
}
_siftDown() {
let idx = 0, len = this.heap.length;
while (true) {
let left = 2 * idx + 1, right = 2 * idx + 2, smallest = idx;
if (left < len && this.heap[left] < this.heap[smallest]) smallest = left;
if (right < len && this.heap[right] < this.heap[smallest]) smallest = right;
if (smallest === idx) break;
[this.heap[smallest], this.heap[idx]] = [this.heap[idx], this.heap[smallest]];
idx = smallest;
}
}
}
class KthLargest {
constructor(k, nums) {
this.k = k;
this.heap = new MinHeap();
for (let n of nums) {
this.heap.push(n);
if (this.heap.size() > k) this.heap.pop();
}
}
add(val) {
this.heap.push(val);
if (this.heap.size() > this.k) this.heap.pop();
return this.heap.peek();
}
}
The Kth Largest Element in a Stream problem demonstrates the power of using a min-heap to efficiently maintain the k
largest elements in a dynamic sequence. By always keeping only the k
largest numbers and discarding smaller ones, we ensure that each add
operation is fast and memory usage is low. This elegant solution is a great example of how the right data structure can dramatically simplify and optimize a problem.