1032. Stream of Characters - Leetcode Solution

Problem Description

The "Stream of Characters" problem asks you to implement a data structure that can efficiently determine, after each character input, whether the current sequence of the most recent characters forms any word from a given list of words.

You are given a list of strings words at initialization.
You receive a stream of lowercase letters one by one via a query(letter) function.
After each call to query, return true if the sequence of the most recent characters (ending with the newest letter) forms any word in words as a suffix; otherwise, return false.
Each word in words can be used multiple times, and the stream can be very long.

Constraints:

1 ≤ words.length ≤ 2000
1 ≤ words[i].length ≤ 200
Letters are lowercase English letters.
Queries can be up to 40,000 calls.

Thought Process

At first glance, the problem seems to require checking, after each new letter, whether any suffix of the current stream matches any of the words in the list. A brute-force approach would be to, on each query, compare all possible suffixes of the current stream to all words. But with up to 40,000 queries and 2,000 words (each up to 200 characters), this would be far too slow.

We need a way to efficiently check whether the most recent characters form any word in the list, ideally in constant or near-constant time per query. This suggests using a data structure that can quickly check for suffix matches as the stream grows.

The challenge is that, unlike prefix matching (which is well-served by a standard Trie), we need to check for suffixes. This leads to the idea of reversing all words and building a Trie of the reversed words, so the problem becomes prefix-matching on the reversed stream.

Solution Approach

Here’s how we can solve the problem efficiently:

Reverse all words: For each word in words, reverse it. This lets us check for suffixes by matching prefixes on the reversed stream.
Build a Trie: Insert each reversed word into a Trie (prefix tree). Each node in the Trie represents a possible character in the reversed word, and nodes are marked as "end of word" appropriately.
Maintain a stream buffer: As each letter is queried, keep a buffer of the most recent letters (up to the length of the longest word).
Query efficiently: On each query, append the new letter to the buffer. Then, starting from the most recent letter and moving backward, walk the Trie according to the buffer. If you reach a Trie node marked as "end of word" during this traversal, return true; otherwise, return false.
Optimization: Only keep as many recent letters as the length of the longest word, since no word can be longer than that.

This approach ensures that each query is handled in time proportional to the maximum word length, not the total number of words or stream length.

Example Walkthrough

Suppose words = ["cd", "f", "kl"]. We reverse and build a Trie for ["dc", "f", "lk"].

Query('a'): Buffer = ['a']. No match in Trie. Return false.
Query('b'): Buffer = ['a', 'b']. Try "b", then "ba" — no match. Return false.
Query('c'): Buffer = ['a', 'b', 'c']. Try "c" (no), "bc" (no), "abc" (no). Return false.
Query('d'): Buffer = ['a', 'b', 'c', 'd']. Try "d" (no), "cd" — matches "cd" (since "dc" is in Trie as reversed "cd"). Return true.
Query('f'): Buffer = ['a', 'b', 'c', 'd', 'f']. Try "f" — matches. Return true.
Query('g'): Buffer = ['a', 'b', 'c', 'd', 'f', 'g']. No matches. Return false.
Query('k'): Buffer = ['a', 'b', 'c', 'd', 'f', 'g', 'k']. No matches. Return false.
Query('l'): Buffer = ['a', 'b', 'c', 'd', 'f', 'g', 'k', 'l']. Try "l", "kl" — matches "kl". Return true.

Time and Space Complexity

Brute-force approach:
- Each query could check all possible suffixes (up to 200 characters) against all 2000 words: O(W * L) per query, where W is word count and L is word length.
- This is far too slow for large inputs.
Optimized Trie approach:
- Initialization: Building the Trie takes O(W * L) time and space (each word is inserted character by character).
- Per query: Each query walks the Trie for at most L steps (the longest word), so O(L) per query.
- Space: Trie uses O(W * L) space; the buffer uses O(L) space.
Summary: This approach is efficient and scales well with the problem constraints.

Summary

In this problem, we efficiently check for word suffixes in a character stream by reversing all words and building a Trie for quick prefix matching on the reversed stream. This allows us to answer each query in time proportional to the longest word, making the solution both elegant and performant. The approach highlights the power of Tries for string matching problems, especially when combined with clever transformations like reversing for suffix checks.

Code Implementation

class TrieNode:
    def __init__(self):
        self.children = {}
        self.is_word = False

class StreamChecker:

    def __init__(self, words):
        self.root = TrieNode()
        self.max_len = 0
        for word in words:
            self.max_len = max(self.max_len, len(word))
            node = self.root
            for c in reversed(word):
                if c not in node.children:
                    node.children[c] = TrieNode()
                node = node.children[c]
            node.is_word = True
        self.stream = []

    def query(self, letter):
        self.stream.append(letter)
        if len(self.stream) > self.max_len:
            self.stream.pop(0)
        node = self.root
        for c in reversed(self.stream):
            if c not in node.children:
                return False
            node = node.children[c]
            if node.is_word:
                return True
        return False

class TrieNode {
public:
    TrieNode* children[26];
    bool is_word;
    TrieNode() : is_word(false) {
        memset(children, 0, sizeof(children));
    }
};

class StreamChecker {
    TrieNode* root;
    int max_len;
    std::deque<char> stream;
public:
    StreamChecker(vector<string>& words) {
        root = new TrieNode();
        max_len = 0;
        for (const auto& word : words) {
            max_len = max(max_len, (int)word.size());
            TrieNode* node = root;
            for (int i = word.size() - 1; i >= 0; --i) {
                int idx = word[i] - 'a';
                if (!node->children[idx])
                    node->children[idx] = new TrieNode();
                node = node->children[idx];
            }
            node->is_word = true;
        }
    }
    
    bool query(char letter) {
        stream.push_back(letter);
        if (stream.size() > max_len)
            stream.pop_front();
        TrieNode* node = root;
        for (auto it = stream.rbegin(); it != stream.rend(); ++it) {
            int idx = *it - 'a';
            if (!node->children[idx])
                return false;
            node = node->children[idx];
            if (node->is_word)
                return true;
        }
        return false;
    }
};

class TrieNode {
    TrieNode[] children = new TrieNode[26];
    boolean isWord = false;
}

class StreamChecker {
    TrieNode root = new TrieNode();
    int maxLen = 0;
    Deque<Character> stream = new ArrayDeque<>();

    public StreamChecker(String[] words) {
        for (String word : words) {
            maxLen = Math.max(maxLen, word.length());
            TrieNode node = root;
            for (int i = word.length() - 1; i >= 0; --i) {
                int idx = word.charAt(i) - 'a';
                if (node.children[idx] == null)
                    node.children[idx] = new TrieNode();
                node = node.children[idx];
            }
            node.isWord = true;
        }
    }
    
    public boolean query(char letter) {
        stream.addLast(letter);
        if (stream.size() > maxLen)
            stream.removeFirst();
        TrieNode node = root;
        Iterator<Character> it = stream.descendingIterator();
        while (it.hasNext()) {
            char c = it.next();
            int idx = c - 'a';
            if (node.children[idx] == null)
                return false;
            node = node.children[idx];
            if (node.isWord)
                return true;
        }
        return false;
    }
}

class TrieNode {
    constructor() {
        this.children = {};
        this.isWord = false;
    }
}

class StreamChecker {
    constructor(words) {
        this.root = new TrieNode();
        this.maxLen = 0;
        for (let word of words) {
            this.maxLen = Math.max(this.maxLen, word.length);
            let node = this.root;
            for (let i = word.length - 1; i >= 0; --i) {
                let c = word[i];
                if (!node.children[c])
                    node.children[c] = new TrieNode();
                node = node.children[c];
            }
            node.isWord = true;
        }
        this.stream = [];
    }
    
    query(letter) {
        this.stream.push(letter);
        if (this.stream.length > this.maxLen)
            this.stream.shift();
        let node = this.root;
        for (let i = this.stream.length - 1; i >= 0; --i) {
            let c = this.stream[i];
            if (!node.children[c])
                return false;
            node = node.children[c];
            if (node.isWord)
                return true;
        }
        return false;
    }
}

AlgoMap.io

Want Help Cracking FAANG?