import random
import bisect
from collections import defaultdict
class MajorityChecker:
def __init__(self, arr):
self.arr = arr
self.idx_map = defaultdict(list)
for i, num in enumerate(arr):
self.idx_map[num].append(i)
def query(self, left, right, threshold):
for _ in range(20):
rand_idx = random.randint(left, right)
candidate = self.arr[rand_idx]
idx_list = self.idx_map[candidate]
l = bisect.bisect_left(idx_list, left)
r = bisect.bisect_right(idx_list, right)
if r - l >= threshold:
return candidate
return -1
#include <vector>
#include <unordered_map>
#include <algorithm>
#include <random>
using namespace std;
class MajorityChecker {
vector<int> arr;
unordered_map<int, vector<int>> idx_map;
mt19937 rng;
public:
MajorityChecker(vector<int>& arr): arr(arr), rng(random_device{}()) {
for (int i = 0; i < arr.size(); ++i)
idx_map[arr[i]].push_back(i);
}
int query(int left, int right, int threshold) {
uniform_int_distribution<int> dist(left, right);
for (int t = 0; t < 20; ++t) {
int rand_idx = dist(rng);
int candidate = arr[rand_idx];
const vector<int>& idx_list = idx_map[candidate];
auto l = lower_bound(idx_list.begin(), idx_list.end(), left);
auto r = upper_bound(idx_list.begin(), idx_list.end(), right);
if (r - l >= threshold)
return candidate;
}
return -1;
}
};
import java.util.*;
class MajorityChecker {
int[] arr;
Map<Integer, List<Integer>> idxMap;
Random rand;
public MajorityChecker(int[] arr) {
this.arr = arr;
idxMap = new HashMap<>();
rand = new Random();
for (int i = 0; i < arr.length; ++i) {
idxMap.computeIfAbsent(arr[i], k -> new ArrayList<>()).add(i);
}
}
public int query(int left, int right, int threshold) {
for (int t = 0; t < 20; ++t) {
int randIdx = left + rand.nextInt(right - left + 1);
int candidate = arr[randIdx];
List<Integer> idxList = idxMap.get(candidate);
int l = Collections.binarySearch(idxList, left);
int r = Collections.binarySearch(idxList, right + 1);
if (l < 0) l = -l - 1;
if (r < 0) r = -r - 1;
if (r - l >= threshold)
return candidate;
}
return -1;
}
}
class MajorityChecker {
constructor(arr) {
this.arr = arr;
this.idxMap = new Map();
for (let i = 0; i < arr.length; ++i) {
if (!this.idxMap.has(arr[i])) this.idxMap.set(arr[i], []);
this.idxMap.get(arr[i]).push(i);
}
}
query(left, right, threshold) {
for (let t = 0; t < 20; ++t) {
let randIdx = left + Math.floor(Math.random() * (right - left + 1));
let candidate = this.arr[randIdx];
let idxList = this.idxMap.get(candidate);
let l = this.lowerBound(idxList, left);
let r = this.upperBound(idxList, right);
if (r - l >= threshold) return candidate;
}
return -1;
}
lowerBound(arr, target) {
let l = 0, r = arr.length;
while (l < r) {
let m = (l + r) >> 1;
if (arr[m] < target) l = m + 1;
else r = m;
}
return l;
}
upperBound(arr, target) {
let l = 0, r = arr.length;
while (l < r) {
let m = (l + r) >> 1;
if (arr[m] <= target) l = m + 1;
else r = m;
}
return l;
}
}
You are given an array arr
of integers. You need to design a data structure that, given queries of the form (left, right, threshold)
, efficiently returns the majority element in the subarray arr[left..right]
(inclusive), if it occurs at least threshold
times. If no such element exists, return -1
.
threshold
times within the range [left, right]
.
1 <= arr.length <= 2 * 10^4
1 <= arr[i] <= 10^9
queries
may be up to 10^4
The solution must be efficient enough to handle many queries, and brute-force enumeration of the subarray for each query is too slow.
Let's start by considering the brute-force approach: for each query, count the frequency of every element in the subarray arr[left..right]
. If any element occurs at least threshold
times, return it; otherwise, return -1
.
However, this approach is inefficient, especially for large arrays and many queries, because each query could take O(N) time. We need to optimize.
Key insight: The majority element (if one exists) must appear frequently in the subarray. If we randomly pick indices in the subarray, the majority element is more likely to be picked than others. This suggests a probabilistic (random sampling) approach.
To check the frequency of a candidate quickly, we can precompute and store the list of indices for each value in arr
. Then, we can binary search this list to count how many times the candidate appears in the queried range.
This approach combines random sampling (to efficiently guess a likely majority) with fast lookup (using binary search on precomputed positions).
The solution uses a combination of pre-processing and random sampling to answer queries efficiently.
arr
to a sorted list of its indices. This allows us to quickly count how many times a value appears in any subarray using binary search.
(left, right, threshold)
, repeat the following up to 20 times:
left
and right
.candidate
be the value at that index.candidate
, use binary search to count how many times candidate
appears in [left, right]
.threshold
, return candidate
as the answer.-1
.
Suppose arr = [1,2,1,2,1,2,2]
and we receive the query (0, 6, 4)
.
[1,2,1,2,1,2,2]
.
[1,3,5,6]
.
[0,2,4]
, which has only 3 occurrences in the range, which is less than threshold.
arr
, to build the index map.
The problem of finding the majority element in an online subarray query is efficiently solved by combining random sampling with precomputed index lists for fast frequency checks. This approach leverages the high probability of sampling a majority element and uses binary search for quick counting. The result is a solution that is both elegant and performant, able to handle large arrays and many queries with ease.