Want Help Cracking FAANG?

(Then click this)

×
Back to Question Bank

1062. Longest Repeating Substring - Leetcode Solution

Problem Description

Given a string S, find the length of the longest substring that appears at least twice in S. The two occurrences of the substring may overlap. If there is no such repeating substring, return 0.

Constraints:

  • The input string S consists only of lowercase English letters.
  • The length of S is between 1 and 1500, inclusive.
Example:
Input: S = "abcd"
Output: 0

Input: S = "abbaba"
Output: 2
    

You must find the length of the longest substring that repeats at least twice in S. Substrings must be contiguous, and the repeated occurrences can overlap.

Thought Process

The naive approach is to check every possible substring and see if it appears at least twice. However, this is highly inefficient, especially for longer strings, because there are a large number of substrings, and checking each one would take too long.

To improve, we can think about ways to efficiently search for repeating substrings. One way is to use hashing (such as Rabin-Karp) to quickly compare substrings of the same length. Another idea is to use binary search to find the maximum possible length of a repeating substring, which can reduce the number of checks we need to make.

The key insight is that for a given length L, we can efficiently check if any substring of length L appears more than once using a set or hash map to store seen substrings. If we can check for a given L quickly, we can use binary search to find the largest such L.

Solution Approach

We'll use a combination of binary search and hashing to solve the problem efficiently:

  1. Binary Search Over Substring Length:
    • Set two pointers: left = 1 and right = len(S) - 1. The answer must be between these values.
    • While left <= right, do:
      • Set mid = (left + right) // 2. Check if there is any substring of length mid that occurs more than once in S.
      • If yes, it means there might be a longer repeating substring, so set left = mid + 1.
      • If not, try shorter substrings, so set right = mid - 1.
  2. Checking for Repeats Using Hashing:
    • For a given length L, slide a window of size L over the string.
    • Hash each substring and store the hashes in a set.
    • If a hash appears twice, we've found a repeated substring of length L.
    • To avoid hash collisions, we can use a rolling hash (Rabin-Karp) or simply compare the actual substrings if a hash collision is detected.
  3. Result:
    • The maximum length L for which a repeat exists is our answer.

This approach is efficient because binary search reduces the number of lengths to check (logarithmic in the length of the string), and hashing allows us to check for repeats among all substrings of a given length in linear time.

Example Walkthrough

Let's walk through the input S = "abbaba":

  1. The possible substring lengths to check are from 1 to 5.
  2. We start with binary search: left = 1, right = 5.
  3. First iteration: mid = 3.
    • Substrings of length 3: "abb", "bba", "bab", "aba".
    • All are unique, so no repeat. Set right = 2.
  4. Second iteration: mid = 1.
    • Substrings: "a", "b", "b", "a", "b", "a".
    • "a" and "b" both appear more than once. So, set left = 2.
  5. Third iteration: mid = 2.
    • Substrings: "ab", "bb", "ba", "ab", "ba".
    • "ab" and "ba" both appear more than once. Set left = 3.
  6. Now left > right, so the answer is the last found length with repeats, which is 2.

This shows how binary search narrows down the possible lengths, and hashing allows us to check for repeats efficiently.

Time and Space Complexity

  • Brute-force Approach:
    • For every substring, check if it appears more than once. There are O(N2) substrings, and checking each can take O(N), for a total of O(N3).
  • Optimized Approach (Binary Search + Hashing):
    • Binary search runs in O(log N) iterations.
    • For each length, we check all substrings using a rolling hash in O(N) time.
    • Total time complexity: O(N log N).
    • Space complexity: O(N), for storing hashes of substrings.

The optimized approach is much more efficient and suitable for the problem's constraints.

Summary

To find the length of the longest repeating substring in a string, we use binary search to efficiently narrow down the possible lengths and hashing to quickly check for repeated substrings of a given length. This combination leverages the strengths of both algorithms, making the solution fast and effective for large inputs. The main insight is to treat the problem as a search over substring lengths, and to use efficient data structures for checking repeats.

Code Implementation

class Solution:
    def longestRepeatingSubstring(self, S: str) -> int:
        def search(L):
            seen = set()
            base = 256
            mod = 2**63 - 1
            hash_val = 0
            for i in range(L):
                hash_val = (hash_val * base + ord(S[i])) % mod
            seen.add(hash_val)
            baseL = pow(base, L, mod)
            for i in range(1, len(S) - L + 1):
                hash_val = (hash_val * base - ord(S[i-1]) * baseL + ord(S[i+L-1])) % mod
                if hash_val in seen:
                    return True
                seen.add(hash_val)
            return False

        left, right = 1, len(S) - 1
        ans = 0
        while left <= right:
            mid = (left + right) // 2
            if search(mid):
                ans = mid
                left = mid + 1
            else:
                right = mid - 1
        return ans
      
class Solution {
public:
    int longestRepeatingSubstring(string S) {
        int n = S.length();
        int left = 1, right = n - 1, ans = 0;
        while (left <= right) {
            int mid = (left + right) / 2;
            if (hasRepeat(S, mid)) {
                ans = mid;
                left = mid + 1;
            } else {
                right = mid - 1;
            }
        }
        return ans;
    }
private:
    bool hasRepeat(const string& S, int L) {
        unordered_set<long long> seen;
        long long hash = 0, base = 256, mod = (1LL << 61) - 1;
        long long baseL = 1;
        for (int i = 0; i < L; ++i) {
            hash = (hash * base + S[i]) % mod;
            baseL = (baseL * base) % mod;
        }
        seen.insert(hash);
        for (int i = 1; i + L <= S.size(); ++i) {
            hash = (hash * base - S[i-1] * baseL % mod + mod) % mod;
            hash = (hash + S[i+L-1]) % mod;
            if (seen.count(hash)) return true;
            seen.insert(hash);
        }
        return false;
    }
};
      
class Solution {
    public int longestRepeatingSubstring(String S) {
        int n = S.length();
        int left = 1, right = n - 1, ans = 0;
        while (left <= right) {
            int mid = (left + right) / 2;
            if (hasRepeat(S, mid)) {
                ans = mid;
                left = mid + 1;
            } else {
                right = mid - 1;
            }
        }
        return ans;
    }
    private boolean hasRepeat(String S, int L) {
        HashSet<Long> seen = new HashSet<>();
        long hash = 0, base = 256, mod = (1L << 61) - 1;
        long baseL = 1;
        for (int i = 0; i < L; ++i) {
            hash = (hash * base + S.charAt(i)) % mod;
            baseL = (baseL * base) % mod;
        }
        seen.add(hash);
        for (int i = 1; i + L <= S.length(); ++i) {
            hash = (hash * base - S.charAt(i-1) * baseL % mod + mod) % mod;
            hash = (hash + S.charAt(i+L-1)) % mod;
            if (seen.contains(hash)) return true;
            seen.add(hash);
        }
        return false;
    }
}
      
var longestRepeatingSubstring = function(S) {
    function search(L) {
        let seen = new Set();
        let base = 256, mod = 2 ** 53 - 1;
        let hash = 0;
        for (let i = 0; i < L; ++i) {
            hash = (hash * base + S.charCodeAt(i)) % mod;
        }
        seen.add(hash);
        let baseL = 1;
        for (let i = 0; i < L; ++i) baseL = (baseL * base) % mod;
        for (let i = 1; i + L <= S.length; ++i) {
            hash = (hash * base - S.charCodeAt(i-1) * baseL % mod + mod) % mod;
            hash = (hash + S.charCodeAt(i+L-1)) % mod;
            if (seen.has(hash)) return true;
            seen.add(hash);
        }
        return false;
    }
    let left = 1, right = S.length - 1, ans = 0;
    while (left <= right) {
        let mid = Math.floor((left + right) / 2);
        if (search(mid)) {
            ans = mid;
            left = mid + 1;
        } else {
            right = mid - 1;
        }
    }
    return ans;
};