Given a list of numbers and their frequencies, determine the median of the expanded list (i.e., if you were to write out each number as many times as its frequency, what would the median value be?). The input is given as two lists: nums
(an ascending list of unique integers) and freq
(where freq[i]
is the frequency of nums[i]
).
At first glance, it seems natural to expand the list by repeating each number according to its frequency, then simply sort and select the median. However, this approach is infeasible for large frequencies due to memory and time constraints.
To optimize, we realize that since nums
is already sorted, and we know how many times each number appears, we can use prefix sums to efficiently determine the position of the median in the virtual expanded array. This allows us to find the median without ever building the full list.
The shift is from a brute-force "construct and scan" approach to a "prefix sum and locate" approach, leveraging the sorted order and cumulative frequencies.
n
.n
is odd, the median is at position n // 2
(0-based index).n
is even, the median is the average of the values at positions n // 2 - 1
and n // 2
.n
is odd, return that value.n
is even, return the average of the two found values.This approach avoids explicit expansion and leverages the sorted order and cumulative frequency counts for efficiency.
Consider nums = [1, 2, 3]
and freq = [2, 3, 1]
. The expanded list would be: [1, 1, 2, 2, 2, 3]
.
n//2 - 1 = 2
and n//2 = 3
nums
and freq
).
By leveraging prefix sums and the sorted nature of nums
, we can efficiently find the median of a virtual expanded list without ever constructing it. This approach is both time- and space-efficient, and highlights the power of cumulative counting for problems involving frequencies and positions.
def find_median(nums, freq):
n = sum(freq)
prefix = []
total = 0
for f in freq:
total += f
prefix.append(total)
def find_kth(k):
for i, p in enumerate(prefix):
if k < p:
return nums[i]
return -1 # Should not happen
if n % 2 == 1:
return float(find_kth(n // 2))
else:
left = find_kth(n // 2 - 1)
right = find_kth(n // 2)
return (left + right) / 2.0
#include <vector>
using namespace std;
double findMedian(vector<int>& nums, vector<int>& freq) {
int n = 0;
for (int f : freq) n += f;
vector<int> prefix;
int total = 0;
for (int f : freq) {
total += f;
prefix.push_back(total);
}
auto find_kth = [&](int k) {
for (int i = 0; i < prefix.size(); ++i) {
if (k < prefix[i]) return nums[i];
}
return -1;
};
if (n % 2 == 1) {
return (double)find_kth(n / 2);
} else {
int left = find_kth(n / 2 - 1);
int right = find_kth(n / 2);
return (left + right) / 2.0;
}
}
public class Solution {
public double findMedian(int[] nums, int[] freq) {
int n = 0;
for (int f : freq) n += f;
int[] prefix = new int[freq.length];
int total = 0;
for (int i = 0; i < freq.length; ++i) {
total += freq[i];
prefix[i] = total;
}
int findKth(int k) {
for (int i = 0; i < prefix.length; ++i) {
if (k < prefix[i]) return nums[i];
}
return -1;
}
if (n % 2 == 1) {
return (double)findKth(n / 2);
} else {
int left = findKth(n / 2 - 1);
int right = findKth(n / 2);
return (left + right) / 2.0;
}
}
}
function findMedian(nums, freq) {
let n = freq.reduce((a, b) => a + b, 0);
let prefix = [];
let total = 0;
for (let f of freq) {
total += f;
prefix.push(total);
}
function findKth(k) {
for (let i = 0; i < prefix.length; i++) {
if (k < prefix[i]) return nums[i];
}
return -1;
}
if (n % 2 === 1) {
return findKth(Math.floor(n / 2));
} else {
let left = findKth(n / 2 - 1);
let right = findKth(n / 2);
return (left + right) / 2.0;
}
}