Want Help Cracking FAANG?

(Then click this)

×
Back to Question Bank

692. Top K Frequent Words - Leetcode Solution

Code Implementation

from collections import Counter
import heapq

class Solution:
    def topKFrequent(self, words, k):
        count = Counter(words)
        # Heap: (negative frequency, word) so that higher freq comes first, and lex order for ties
        heap = [(-freq, word) for word, freq in count.items()]
        heapq.heapify(heap)
        result = []
        for _ in range(k):
            result.append(heapq.heappop(heap)[1])
        return result
      
class Solution {
public:
    vector<string> topKFrequent(vector<string>& words, int k) {
        unordered_map<string, int> count;
        for (const auto& word : words) count[word]++;
        auto cmp = [](const pair<int, string>& a, const pair<int, string>& b) {
            if (a.first == b.first) return a.second > b.second;
            return a.first < b.first;
        };
        priority_queue<pair<int, string>, vector<pair<int, string>>, decltype(cmp)> pq(cmp);
        for (const auto& p : count) {
            pq.push({p.second, p.first});
        }
        vector<string> res;
        while (k-- > 0 && !pq.empty()) {
            res.push_back(pq.top().second);
            pq.pop();
        }
        return res;
    }
};
      
import java.util.*;

class Solution {
    public List<String> topKFrequent(String[] words, int k) {
        Map<String, Integer> count = new HashMap<>();
        for (String w : words) count.put(w, count.getOrDefault(w, 0) + 1);
        PriorityQueue<String> heap = new PriorityQueue<>(
            (a, b) -> count.get(a).equals(count.get(b)) ? a.compareTo(b) : count.get(b) - count.get(a)
        );
        heap.addAll(count.keySet());
        List<String> res = new ArrayList<>();
        for (int i = 0; i < k; i++) res.add(heap.poll());
        return res;
    }
}
      
var topKFrequent = function(words, k) {
    let count = {};
    for (let word of words) {
        count[word] = (count[word] || 0) + 1;
    }
    // Sort by frequency desc, then lex order asc
    let sorted = Object.keys(count).sort((a, b) => {
        if (count[b] === count[a]) return a.localeCompare(b);
        return count[b] - count[a];
    });
    return sorted.slice(0, k);
};
      

Problem Description

Given an array of strings words and an integer k, return the k most frequent words. The result must be sorted by frequency from highest to lowest. If two words have the same frequency, sort them by their lexicographical (dictionary) order.

  • Each word in words is a lowercase English string.
  • There is always at least one valid solution.
  • Do not reuse elements; only count each occurrence.
  • The answer must have exactly k words.

Thought Process

The first idea is to count how many times each word appears in the list. This is a classic frequency counting problem, which can be solved using a hash map or dictionary. Once we know the frequency of each word, we need to find the top k most frequent words. The tricky part is handling ties: if two words have the same frequency, we should return the word that comes first alphabetically.

A brute-force approach would be to count the frequencies, then sort all words by their frequency and, in case of a tie, by lexicographical order. However, sorting the entire list of unique words can be slow if there are many unique words. To optimize, we can use a heap (priority queue) to keep track of the top k words efficiently.

Solution Approach

  • Step 1: Count Frequencies
    Use a hash map (dictionary) to count how many times each word appears in the input array. This provides O(1) time lookup and update per word.
  • Step 2: Build a Heap (Priority Queue)
    Since we want the top k frequent words, we can use a heap to efficiently get the most frequent words. In Python, we use a min-heap by default, but we can simulate a max-heap by pushing negative frequencies. In C++ and Java, custom comparators can be used.
  • Step 3: Tie-breaking with Lexicographical Order
    When two words have the same frequency, we compare them alphabetically. This is handled by the tuple (frequency, word) in the heap, where the frequency is the primary key (descending), and the word is the secondary key (ascending).
  • Step 4: Extract Top K Words
    Pop or select the top k elements from the heap or sorted list to get the answer.
  • Step 5: Return the Result
    Return the list of words in the required order.

This approach ensures we never sort more than necessary, and always break ties in the correct way.

Example Walkthrough

Sample Input:
words = ["i", "love", "leetcode", "i", "love", "coding"]
k = 2

  1. Count frequencies:
    • "i" appears 2 times
    • "love" appears 2 times
    • "leetcode" appears 1 time
    • "coding" appears 1 time
  2. Build heap or sort:
    Sort by frequency descending, then lexicographically:
    • "i": 2
    • "love": 2
    • "coding": 1
    • "leetcode": 1
    Between "i" and "love" (both 2), "i" comes before "love" alphabetically.
  3. Top 2 words:
    ["i", "love"]

The output is ["i", "love"], which matches the problem's requirements.

Time and Space Complexity

  • Brute-force Approach:
    - Counting frequencies: O(N), where N is the number of words.
    - Sorting all unique words: O(M log M), where M is the number of unique words.
    - Total: O(N + M log M)
  • Optimized Heap Approach:
    - Counting frequencies: O(N)
    - Building the heap: O(M)
    - Extracting k elements: O(k log M)
    - Total: O(N + M + k log M), which is often faster than sorting all M words.
  • Space Complexity:
    - O(M) for the frequency map and heap.

Here, N is the length of the input list, and M is the number of unique words.

Summary

The "Top K Frequent Words" problem combines frequency counting with custom sorting. By using a hash map for counting and a heap or custom sort for extracting the top k frequent words (with lexicographical tie-breaking), we achieve an efficient and elegant solution. The main insight is to use data structures that support fast frequency lookups and ordering, making this a great example of how hash maps and heaps can be applied together in algorithmic problems.