Given a string text
, find the number of distinct echo substrings in it.
An echo substring is a non-empty substring of text
that can be written as the concatenation of two identical substrings. In other words, for some substring s
, if s = a + a
for some non-empty string a
, then s
is an echo substring.
Constraints:
text.length
≤ 2000text
consists only of lowercase English letterstext
.
To solve this problem, let's start by considering what an echo substring is. For any substring, if you can split it into two equal halves and both halves are the same, it's an echo substring. The brute-force approach is to check all possible substrings, split them in half, and compare the halves. However, this would be very slow for longer strings, because there are O(N2) substrings and each comparison can take O(N) time.
To optimize, we need a faster way to compare substrings. One common technique is to use rolling hash (e.g., Rabin-Karp) to compare substrings in constant time. This way, we can efficiently determine if two halves are equal without character-by-character comparison.
Another key insight is that we only need to check substrings of even length, since echo substrings by definition are made of two equal halves. We can use a set to store unique echo substrings and ensure we count each only once.
Let's break down the algorithm step by step:
text[i:i+2k]
, check if text[i:i+k]
is equal to text[i+k:i+2k]
.This approach is efficient and leverages hashing to avoid redundant character comparisons.
Let's walk through an example with text = "abcabcabc"
.
The answer is 3.
Brute-force approach:
The optimized solution is feasible for N up to 2000.
The problem asks us to find all distinct echo substrings in a given string. The key insight is that echo substrings have even length and can be split into two equal halves. By using a rolling hash, we can efficiently compare substring halves and avoid redundant work. The use of a set ensures we only count unique echo substrings. This approach is both efficient and elegant, leveraging classic string processing techniques to solve the problem within tight constraints.
class Solution:
def distinctEchoSubstrings(self, text: str) -> int:
n = len(text)
base, mod = 31, 10**9 + 7
prefix = [0] * (n + 1)
power = [1] * (n + 1)
for i in range(n):
prefix[i+1] = (prefix[i] * base + ord(text[i]) - ord('a') + 1) % mod
power[i+1] = (power[i] * base) % mod
def get_hash(l, r):
return (prefix[r] - prefix[l] * power[r - l]) % mod
seen = set()
for length in range(2, n + 1, 2): # even lengths only
half = length // 2
for i in range(n - length + 1):
if get_hash(i, i + half) == get_hash(i + half, i + length):
seen.add(text[i:i+length])
return len(seen)
class Solution {
public:
int distinctEchoSubstrings(string text) {
int n = text.size();
const int base = 31, mod = 1e9 + 7;
vector prefix(n + 1, 0), power(n + 1, 1);
for (int i = 0; i < n; ++i) {
prefix[i+1] = (prefix[i] * base + (text[i] - 'a' + 1)) % mod;
power[i+1] = (power[i] * base) % mod;
}
auto get_hash = [&](int l, int r) {
return (prefix[r] - prefix[l] * power[r - l] % mod + mod) % mod;
};
unordered_set<string> seen;
for (int len = 2; len <= n; len += 2) {
int half = len / 2;
for (int i = 0; i + len <= n; ++i) {
if (get_hash(i, i + half) == get_hash(i + half, i + len)) {
seen.insert(text.substr(i, len));
}
}
}
return seen.size();
}
};
class Solution {
public int distinctEchoSubstrings(String text) {
int n = text.length();
int base = 31, mod = 1000000007;
long[] prefix = new long[n + 1];
long[] power = new long[n + 1];
power[0] = 1;
for (int i = 0; i < n; ++i) {
prefix[i+1] = (prefix[i] * base + (text.charAt(i) - 'a' + 1)) % mod;
power[i+1] = (power[i] * base) % mod;
}
HashSet<String> seen = new HashSet<>();
for (int len = 2; len <= n; len += 2) {
int half = len / 2;
for (int i = 0; i + len <= n; ++i) {
long left = (prefix[i + half] - prefix[i] * power[half] % mod + mod) % mod;
long right = (prefix[i + len] - prefix[i + half] * power[half] % mod + mod) % mod;
if (left == right) {
seen.add(text.substring(i, i + len));
}
}
}
return seen.size();
}
}
var distinctEchoSubstrings = function(text) {
const n = text.length;
const base = 31, mod = 1e9 + 7;
const prefix = new Array(n + 1).fill(0);
const power = new Array(n + 1).fill(1);
for (let i = 0; i < n; ++i) {
prefix[i+1] = (prefix[i] * base + (text.charCodeAt(i) - 97 + 1)) % mod;
power[i+1] = (power[i] * base) % mod;
}
function get_hash(l, r) {
return (prefix[r] - prefix[l] * power[r - l] % mod + mod) % mod;
}
const seen = new Set();
for (let len = 2; len <= n; len += 2) {
const half = len / 2;
for (let i = 0; i + len <= n; ++i) {
if (get_hash(i, i + half) === get_hash(i + half, i + len)) {
seen.add(text.substring(i, i + len));
}
}
}
return seen.size;
};