Want Help Cracking FAANG?

(Then click this)

×
Back to Question Bank

571. Find Median Given Frequency of Numbers - Leetcode Solution

Problem Description

Given a list of numbers and their frequencies, determine the median of the expanded list (i.e., if you were to write out each number as many times as its frequency, what would the median value be?). The input is given as two lists: nums (an ascending list of unique integers) and freq (where freq[i] is the frequency of nums[i]).

  • There is exactly one valid answer.
  • The input lists are of the same length.
  • The expanded list may be very large, so you must compute the median without explicitly constructing it.
  • The median is defined as the middle value if the total count is odd, or the average of the two middle values if even.

Thought Process

At first glance, it seems natural to expand the list by repeating each number according to its frequency, then simply sort and select the median. However, this approach is infeasible for large frequencies due to memory and time constraints.

To optimize, we realize that since nums is already sorted, and we know how many times each number appears, we can use prefix sums to efficiently determine the position of the median in the virtual expanded array. This allows us to find the median without ever building the full list.

The shift is from a brute-force "construct and scan" approach to a "prefix sum and locate" approach, leveraging the sorted order and cumulative frequencies.

Solution Approach

  1. Compute the total number of elements:
    • Sum all frequencies to get the total count, n.
  2. Build a prefix sum array of frequencies:
    • This tells us, for each index, how many numbers are in the expanded list up to that point.
  3. Determine the position(s) of the median:
    • If n is odd, the median is at position n // 2 (0-based index).
    • If n is even, the median is the average of the values at positions n // 2 - 1 and n // 2.
  4. Find the median value(s) using the prefix sum:
    • Iterate through the prefix sum to find which number contains the median position(s).
    • Since the numbers are sorted, the first prefix sum that exceeds the median index gives the corresponding number.
  5. Return the median:
    • If n is odd, return that value.
    • If n is even, return the average of the two found values.

This approach avoids explicit expansion and leverages the sorted order and cumulative frequency counts for efficiency.

Example Walkthrough

Consider nums = [1, 2, 3] and freq = [2, 3, 1]. The expanded list would be: [1, 1, 2, 2, 2, 3].

  1. Total elements: 2 + 3 + 1 = 6 (even)
  2. Prefix sums: [2, 5, 6]
    • First 2 elements are 1
    • Next 3 elements are 2 (positions 2, 3, 4)
    • Last element is 3 (position 5)
  3. Median positions: n//2 - 1 = 2 and n//2 = 3
    • Position 2: This is the third element, which is 2
    • Position 3: This is the fourth element, which is also 2
  4. Median: (2 + 2) / 2 = 2.0

Time and Space Complexity

  • Brute-force approach:
    • Time: O(N), where N is the total number of elements in the expanded list (could be huge).
    • Space: O(N), for the expanded list.
  • Optimized prefix sum approach:
    • Time: O(k), where k is the number of unique numbers (length of nums and freq).
    • Space: O(k), for the prefix sum array.
  • Why: We only iterate through the input arrays, never building the full expanded list.

Summary

By leveraging prefix sums and the sorted nature of nums, we can efficiently find the median of a virtual expanded list without ever constructing it. This approach is both time- and space-efficient, and highlights the power of cumulative counting for problems involving frequencies and positions.

Code Implementation

def find_median(nums, freq):
    n = sum(freq)
    prefix = []
    total = 0
    for f in freq:
        total += f
        prefix.append(total)
    def find_kth(k):
        for i, p in enumerate(prefix):
            if k < p:
                return nums[i]
        return -1  # Should not happen
    if n % 2 == 1:
        return float(find_kth(n // 2))
    else:
        left = find_kth(n // 2 - 1)
        right = find_kth(n // 2)
        return (left + right) / 2.0
      
#include <vector>
using namespace std;
double findMedian(vector<int>& nums, vector<int>& freq) {
    int n = 0;
    for (int f : freq) n += f;
    vector<int> prefix;
    int total = 0;
    for (int f : freq) {
        total += f;
        prefix.push_back(total);
    }
    auto find_kth = [&](int k) {
        for (int i = 0; i < prefix.size(); ++i) {
            if (k < prefix[i]) return nums[i];
        }
        return -1;
    };
    if (n % 2 == 1) {
        return (double)find_kth(n / 2);
    } else {
        int left = find_kth(n / 2 - 1);
        int right = find_kth(n / 2);
        return (left + right) / 2.0;
    }
}
      
public class Solution {
    public double findMedian(int[] nums, int[] freq) {
        int n = 0;
        for (int f : freq) n += f;
        int[] prefix = new int[freq.length];
        int total = 0;
        for (int i = 0; i < freq.length; ++i) {
            total += freq[i];
            prefix[i] = total;
        }
        int findKth(int k) {
            for (int i = 0; i < prefix.length; ++i) {
                if (k < prefix[i]) return nums[i];
            }
            return -1;
        }
        if (n % 2 == 1) {
            return (double)findKth(n / 2);
        } else {
            int left = findKth(n / 2 - 1);
            int right = findKth(n / 2);
            return (left + right) / 2.0;
        }
    }
}
      
function findMedian(nums, freq) {
    let n = freq.reduce((a, b) => a + b, 0);
    let prefix = [];
    let total = 0;
    for (let f of freq) {
        total += f;
        prefix.push(total);
    }
    function findKth(k) {
        for (let i = 0; i < prefix.length; i++) {
            if (k < prefix[i]) return nums[i];
        }
        return -1;
    }
    if (n % 2 === 1) {
        return findKth(Math.floor(n / 2));
    } else {
        let left = findKth(n / 2 - 1);
        let right = findKth(n / 2);
        return (left + right) / 2.0;
    }
}