from collections import defaultdict, deque
class Solution:
def numMatchingSubseq(self, s: str, words: list[str]) -> int:
waiting = defaultdict(deque)
for word in words:
waiting[word[0]].append(iter(word[1:]))
count = 0
for c in s:
old_waiting = waiting[c]
waiting[c] = deque()
for it in old_waiting:
nxt = next(it, None)
if nxt:
waiting[nxt].append(it)
else:
count += 1
return count
#include <vector>
#include <string>
#include <unordered_map>
#include <queue>
using namespace std;
class Solution {
public:
int numMatchingSubseq(string s, vector<string>& words) {
unordered_map<char, queue<pair<string::iterator, string::iterator>>> waiting;
for (const string& word : words) {
waiting[word[0]].emplace(word.begin() + 1, word.end());
}
int count = 0;
for (char c : s) {
auto old_queue = std::move(waiting[c]);
while (!old_queue.empty()) {
auto it_pair = old_queue.front(); old_queue.pop();
if (it_pair.first == it_pair.second) {
++count;
} else {
waiting[*it_pair.first].emplace(it_pair.first + 1, it_pair.second);
}
}
}
return count;
}
};
import java.util.*;
class Solution {
public int numMatchingSubseq(String s, String[] words) {
Map<Character, Queue<Iterator<Character>>> waiting = new HashMap<>();
for (char c = 'a'; c <= 'z'; c++) waiting.put(c, new LinkedList<>());
for (String word : words) {
waiting.get(word.charAt(0)).offer(word.chars().skip(1).mapToObj(i -> (char)i).iterator());
}
int count = 0;
for (char c : s.toCharArray()) {
Queue<Iterator<Character>> old = waiting.get(c);
int size = old.size();
for (int i = 0; i < size; i++) {
Iterator<Character> it = old.poll();
if (it.hasNext()) {
waiting.get(it.next()).offer(it);
} else {
count++;
}
}
}
return count;
}
}
var numMatchingSubseq = function(s, words) {
const waiting = {};
for (let i = 0; i < 26; i++) waiting[String.fromCharCode(97 + i)] = [];
for (const word of words) {
waiting[word[0]].push([word, 1]);
}
let count = 0;
for (const c of s) {
const old = waiting[c];
waiting[c] = [];
for (const [word, idx] of old) {
if (idx === word.length) {
count++;
} else {
waiting[word[idx]].push([word, idx + 1]);
}
}
}
return count;
};
You are given a string s
and an array of strings words
. Your task is to determine how many words in words
are subsequences of s
.
s
if you can delete some (or no) characters from s
without changing the order of the remaining characters to form the word.words
should be checked independently.s
for the same position in a word.Constraints:
words
must be checked separately; do not reuse elements.
The most straightforward way to solve this problem is to check, for each word, if it appears as a subsequence in s
. This can be done by iterating through s
and trying to match the characters of the word in order. However, this brute-force approach, especially if s
and words
are large, can be inefficient.
To optimize, we need to avoid redundant work. Notice that many words may share prefixes, and many words may be waiting for the same next character in s
. If we can process all words waiting for the same character together as we scan through s
, we can significantly reduce the number of operations.
The key insight is to use a "bucket" or "waiting list" for each character. Each bucket keeps track of which words (or their iterators) are waiting for that character next. As we process each character in s
, we update the waiting lists accordingly.
Here is a step-by-step breakdown of the optimized algorithm:
words
, put it (or an iterator/pointer to its next character) into a waiting list keyed by the first character it needs to match.s
:
c
in s
(left to right):c
.s
, return the total count of words that matched as subsequences.This approach is efficient because each character in each word is processed exactly once, and we leverage hash maps (or arrays) for fast lookups.
Sample Input:
s = "abcde"
words = ["a", "bb", "acd", "ace"]
s
:
"bb" is not a subsequence (no further 'b' in s
), so the final answer is 3.
s
to check if it is a subsequence.s
.s
.s
.
The optimized approach is much faster, especially when words
contains many strings or s
is long.
In this problem, the key insight is to process all words in words
in a way that minimizes redundant work by grouping them according to the next character they need to match. By using a hash map (or array) of waiting lists for each character, we efficiently move iterators through the words as we scan s
, counting matches as we go. This elegant solution leverages data structures to turn a potentially slow brute-force approach into a highly efficient one, making it suitable for large inputs.