class Solution:
    def distinctSubseqII(self, s: str) -> int:
        MOD = 10 ** 9 + 7
        ends = [0] * 26  # ends[c] = # of subseqs ending with chr(c + ord('a'))
        for ch in s:
            idx = ord(ch) - ord('a')
            total = sum(ends) % MOD
            ends[idx] = (total + 1) % MOD
        return sum(ends) % MOD
class Solution {
public:
    int distinctSubseqII(string s) {
        const int MOD = 1e9 + 7;
        vector<int> ends(26, 0);
        for (char ch : s) {
            int idx = ch - 'a';
            long total = 0;
            for (int cnt : ends) total = (total + cnt) % MOD;
            ends[idx] = (total + 1) % MOD;
        }
        long res = 0;
        for (int cnt : ends) res = (res + cnt) % MOD;
        return (int)res;
    }
};
class Solution {
    public int distinctSubseqII(String s) {
        int MOD = 1000000007;
        int[] ends = new int[26];
        for (char ch : s.toCharArray()) {
            int idx = ch - 'a';
            int total = 0;
            for (int cnt : ends) total = (total + cnt) % MOD;
            ends[idx] = (total + 1) % MOD;
        }
        int res = 0;
        for (int cnt : ends) res = (res + cnt) % MOD;
        return res;
    }
}
var distinctSubseqII = function(s) {
    const MOD = 1e9 + 7;
    let ends = Array(26).fill(0);
    for (let ch of s) {
        let idx = ch.charCodeAt(0) - 'a'.charCodeAt(0);
        let total = ends.reduce((a, b) => (a + b) % MOD, 0);
        ends[idx] = (total + 1) % MOD;
    }
    return ends.reduce((a, b) => (a + b) % MOD, 0);
};
Given a string s, return the number of distinct non-empty subsequences of s. Since the answer may be very large, return it modulo 10^9 + 7.
A subsequence of s is a string obtained from s by deleting some (or no) characters without changing the order of the remaining characters.
Constraints: 1 <= s.length <= 2000, and s consists only of lowercase English letters.
At first glance, the problem asks us to count all possible distinct subsequences of a string. A brute-force approach would involve generating all possible subsequences and then filtering out duplicates, but this quickly becomes infeasible as the input size grows. For a string of length n, there are up to 2^n possible subsequences, so we need a smarter way.
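To make the exponential cost concrete, here is a minimal brute-force sketch that collects every non-empty subsequence in a set. The function name count_distinct_bruteforce is a hypothetical helper, not part of the solution code; it is only practical for short strings, where it can serve as a reference when checking a faster method.

```python
from itertools import combinations


def count_distinct_bruteforce(s: str) -> int:
    """Count distinct non-empty subsequences by explicit enumeration (exponential)."""
    seen = set()
    for length in range(1, len(s) + 1):
        # combinations() yields index tuples in increasing order,
        # so the characters keep their original relative order.
        for idxs in combinations(range(len(s)), length):
            seen.add("".join(s[i] for i in idxs))
    return len(seen)
```

For "aba" this returns 6 ("a", "b", "aa", "ab", "ba", "aba"), fewer than the 7 index subsets, because the two single-character picks of 'a' give the same string.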
The key insight is to recognize patterns in how subsequences are formed as we process the string from left to right. For each new character, every existing subsequence can be extended by this character to form a new subsequence. However, we also have to avoid double-counting subsequences that end with the same character. This leads us to consider using dynamic programming, where we track the number of distinct subsequences ending with each character.
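We use a dynamic programming approach. The idea is to keep an array ends of size 26 (one entry per lowercase letter), where ends[c] is the number of distinct subsequences ending with character c.
1. Initialize ends as an array of zeros.
2. For each character ch in s:
   a. Compute total, the sum of all values in ends.
   b. The number of distinct subsequences ending with ch is total + 1 (the "+1" is for the subsequence containing only ch itself).
   c. Set ends[ch] to this new value. This overwrites the previous count for ch, which is correct because every subsequence that previously ended with ch is produced again by appending the latest ch to a shorter subsequence (or to nothing), so the new value already includes the old one.
3. After processing the whole string, return the sum of all values in ends.
4. Take every sum modulo 10^9 + 7 at each step to avoid overflow.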
This approach ensures we count each distinct subsequence exactly once, efficiently and without generating them explicitly.
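The update rule can also be written as a recurrence. The notation is introduced here for illustration and does not appear in the solution code: E_i(c) denotes the number of distinct non-empty subsequences of the first i characters of s that end with letter c, with E_0(c) = 0 for every c.

```latex
\[
E_i(c) =
\begin{cases}
1 + \sum_{d=\mathtt{a}}^{\mathtt{z}} E_{i-1}(d) & \text{if } c = s_i,\\[4pt]
E_{i-1}(c) & \text{otherwise,}
\end{cases}
\qquad
\text{answer} = \Bigl(\sum_{c=\mathtt{a}}^{\mathtt{z}} E_n(c)\Bigr) \bmod (10^9 + 7).
\]
```

The first case holds because any subsequence ending with s_i can be taken to use position i as its last character. It is therefore either s_i alone or a distinct non-empty subsequence of the first i-1 characters followed by s_i, and different prefixes give different strings, so nothing is counted twice.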
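Let's walk through the algorithm for the string s = "abc":
1. Initialize ends = [0]*26.
2. Process 'a': total = 0, so ends['a'] = 0 + 1 = 1.
3. Process 'b': total = 1, so ends['b'] = 1 + 1 = 2.
4. Process 'c': total = 1 + 2 = 3, so ends['c'] = 3 + 1 = 4.
5. Final answer: ends['a'] + ends['b'] + ends['c'] = 1 + 2 + 4 = 7.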
The 7 distinct non-empty subsequences are: "a", "b", "c", "ab", "ac", "bc", "abc".
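The same trace can be reproduced with a small sketch. trace_ends is a hypothetical helper written for this walkthrough; it returns the non-zero entries of ends after each character, which makes the overwrite step visible.

```python
from typing import Dict, List


def trace_ends(s: str) -> List[Dict[str, int]]:
    """Return a snapshot of the non-zero entries of ends after each character."""
    MOD = 10**9 + 7
    ends = [0] * 26
    snapshots = []
    for ch in s:
        idx = ord(ch) - ord('a')
        # Same update as the solution: all existing subsequences plus ch alone.
        ends[idx] = (sum(ends) + 1) % MOD
        snapshots.append({chr(i + ord('a')): v for i, v in enumerate(ends) if v})
    return snapshots


for step in trace_ends("abc"):
    print(step)
# {'a': 1}
# {'a': 1, 'b': 2}
# {'a': 1, 'b': 2, 'c': 4}
```

For a string with a repeated letter such as "aba", the last snapshot is {'a': 4, 'b': 2}: the second 'a' replaces ends['a'] = 1 with 1 + 2 + 1 = 4, for a total of 6.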
Brute force: generating every subsequence takes O(2^n) time and space, which is infeasible for n = 2000.
Optimized approach:
Time: O(n * 26) (for each character, we sum a 26-length array), which is effectively O(n) since 26 is constant.
Space: O(26) for the ends array, i.e., O(1) extra space.
This makes the solution efficient and scalable for large input sizes.
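If the 26-element sum per character matters, one optional variant keeps a running total and adjusts it whenever an entry is overwritten. This is a sketch of the same recurrence rather than a different algorithm, and the function name distinct_subseq_running_total is hypothetical.

```python
def distinct_subseq_running_total(s: str) -> int:
    """Same DP as above, but total is maintained incrementally instead of re-summed."""
    MOD = 10**9 + 7
    ends = [0] * 26
    total = 0  # invariant: total == sum(ends) % MOD
    for ch in s:
        idx = ord(ch) - ord('a')
        new_count = (total + 1) % MOD
        # Swap this letter's old contribution for the new one.
        # Python's % always returns a non-negative result here.
        total = (total - ends[idx] + new_count) % MOD
        ends[idx] = new_count
    return total
```

In languages where % can return a negative value (C++, Java, JavaScript), the subtraction would need an extra + MOD before taking the remainder.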
The problem of counting distinct non-empty subsequences can be solved efficiently using dynamic programming. By tracking the number of subsequences ending with each character and updating these counts as we traverse the string, we avoid redundant calculations and exponential blowup. The key insight is to realize how new subsequences are formed and how to avoid double-counting. The final approach is both elegant and practical, leveraging simple data structures and modular arithmetic.