Want Help Cracking FAANG?

(Then click this)

×
Back to Question Bank

1316. Distinct Echo Substrings - Leetcode Solution

Problem Description

Given a string text, find the number of distinct echo substrings in it. An echo substring is a non-empty substring of text that can be written as the concatenation of two identical substrings. In other words, for some substring s, if s = a + a for some non-empty string a, then s is an echo substring.

Constraints:

  • 1 ≤ text.length ≤ 2000
  • text consists only of lowercase English letters
The solution must count each distinct echo substring only once, even if it occurs multiple times in text.

Thought Process

To solve this problem, let's start by considering what an echo substring is. For any substring, if you can split it into two equal halves and both halves are the same, it's an echo substring. The brute-force approach is to check all possible substrings, split them in half, and compare the halves. However, this would be very slow for longer strings, because there are O(N2) substrings and each comparison can take O(N) time.

To optimize, we need a faster way to compare substrings. One common technique is to use rolling hash (e.g., Rabin-Karp) to compare substrings in constant time. This way, we can efficiently determine if two halves are equal without character-by-character comparison.

Another key insight is that we only need to check substrings of even length, since echo substrings by definition are made of two equal halves. We can use a set to store unique echo substrings and ensure we count each only once.

Solution Approach

Let's break down the algorithm step by step:

  1. Iterate over all possible substring lengths:
    • Since an echo substring must be made of two equal parts, we only consider even lengths (2, 4, 6, ..., N).
  2. For each starting position, check if the substring is an echo substring:
    • For each even length, slide a window over the string.
    • For a substring text[i:i+2k], check if text[i:i+k] is equal to text[i+k:i+2k].
  3. Use rolling hash for efficient substring comparison:
    • Precompute hash values for all prefixes.
    • Compare hashes of the two halves in O(1) time.
  4. Store found echo substrings in a set:
    • This ensures each distinct echo substring is counted only once.
  5. Return the size of the set as the final answer.

This approach is efficient and leverages hashing to avoid redundant character comparisons.

Example Walkthrough

Let's walk through an example with text = "abcabcabc".

  • Consider substrings of length 2: "ab", "bc", "ca", "ab", "bc", "ca", "ab", "bc". None of these are echo substrings (no way to split into two equal halves).
  • Substrings of length 4:
    • "abca": "ab" vs "ca" (not equal)
    • "bcab": "bc" vs "ab" (not equal)
    • "cabc": "ca" vs "bc" (not equal)
    • "abca": same as above
    • "bcab": same as above
    • "cabc": same as above
  • Substrings of length 6:
    • "abcabc": "abc" vs "abc" (equal), so "abcabc" is an echo substring.
    • "bcabca": "bca" vs "bca" (equal), so "bcabca" is an echo substring.
    • "cabcab": "cab" vs "cab" (equal), so "cabcab" is an echo substring.
    • "abcabc": already seen.
  • Substrings of length 8:
    • "abcabcab": "abca" vs "bcab" (not equal)
    • "bcabcabc": "bcab" vs "cabc" (not equal)
  • The set of distinct echo substrings is {"abcabc", "bcabca", "cabcab"}.

The answer is 3.

Time and Space Complexity

Brute-force approach:

  • There are O(N2) substrings.
  • Comparing two halves takes up to O(N) time per substring.
  • Total time: O(N3).
Optimized approach with rolling hash:
  • Still O(N2) substrings to consider.
  • Comparing two halves is O(1) using hash values.
  • Preprocessing hashes: O(N).
  • Total time: O(N2).
  • Space: O(N2) in the worst case for the set of substrings and hash storage.

The optimized solution is feasible for N up to 2000.

Summary

The problem asks us to find all distinct echo substrings in a given string. The key insight is that echo substrings have even length and can be split into two equal halves. By using a rolling hash, we can efficiently compare substring halves and avoid redundant work. The use of a set ensures we only count unique echo substrings. This approach is both efficient and elegant, leveraging classic string processing techniques to solve the problem within tight constraints.

Code Implementation

class Solution:
    def distinctEchoSubstrings(self, text: str) -> int:
        n = len(text)
        base, mod = 31, 10**9 + 7
        prefix = [0] * (n + 1)
        power = [1] * (n + 1)
        for i in range(n):
            prefix[i+1] = (prefix[i] * base + ord(text[i]) - ord('a') + 1) % mod
            power[i+1] = (power[i] * base) % mod

        def get_hash(l, r):
            return (prefix[r] - prefix[l] * power[r - l]) % mod

        seen = set()
        for length in range(2, n + 1, 2):  # even lengths only
            half = length // 2
            for i in range(n - length + 1):
                if get_hash(i, i + half) == get_hash(i + half, i + length):
                    seen.add(text[i:i+length])
        return len(seen)
      
class Solution {
public:
    int distinctEchoSubstrings(string text) {
        int n = text.size();
        const int base = 31, mod = 1e9 + 7;
        vector prefix(n + 1, 0), power(n + 1, 1);
        for (int i = 0; i < n; ++i) {
            prefix[i+1] = (prefix[i] * base + (text[i] - 'a' + 1)) % mod;
            power[i+1] = (power[i] * base) % mod;
        }
        auto get_hash = [&](int l, int r) {
            return (prefix[r] - prefix[l] * power[r - l] % mod + mod) % mod;
        };
        unordered_set<string> seen;
        for (int len = 2; len <= n; len += 2) {
            int half = len / 2;
            for (int i = 0; i + len <= n; ++i) {
                if (get_hash(i, i + half) == get_hash(i + half, i + len)) {
                    seen.insert(text.substr(i, len));
                }
            }
        }
        return seen.size();
    }
};
      
class Solution {
    public int distinctEchoSubstrings(String text) {
        int n = text.length();
        int base = 31, mod = 1000000007;
        long[] prefix = new long[n + 1];
        long[] power = new long[n + 1];
        power[0] = 1;
        for (int i = 0; i < n; ++i) {
            prefix[i+1] = (prefix[i] * base + (text.charAt(i) - 'a' + 1)) % mod;
            power[i+1] = (power[i] * base) % mod;
        }
        HashSet<String> seen = new HashSet<>();
        for (int len = 2; len <= n; len += 2) {
            int half = len / 2;
            for (int i = 0; i + len <= n; ++i) {
                long left = (prefix[i + half] - prefix[i] * power[half] % mod + mod) % mod;
                long right = (prefix[i + len] - prefix[i + half] * power[half] % mod + mod) % mod;
                if (left == right) {
                    seen.add(text.substring(i, i + len));
                }
            }
        }
        return seen.size();
    }
}
      
var distinctEchoSubstrings = function(text) {
    const n = text.length;
    const base = 31, mod = 1e9 + 7;
    const prefix = new Array(n + 1).fill(0);
    const power = new Array(n + 1).fill(1);
    for (let i = 0; i < n; ++i) {
        prefix[i+1] = (prefix[i] * base + (text.charCodeAt(i) - 97 + 1)) % mod;
        power[i+1] = (power[i] * base) % mod;
    }
    function get_hash(l, r) {
        return (prefix[r] - prefix[l] * power[r - l] % mod + mod) % mod;
    }
    const seen = new Set();
    for (let len = 2; len <= n; len += 2) {
        const half = len / 2;
        for (let i = 0; i + len <= n; ++i) {
            if (get_hash(i, i + half) === get_hash(i + half, i + len)) {
                seen.add(text.substring(i, i + len));
            }
        }
    }
    return seen.size;
};