Want Help Cracking FAANG?

(Then click this)

×
Back to Question Bank

1794. Count Pairs of Equal Substrings With Minimum Difference - Leetcode Solution

Problem Description

Given a string s and an integer k, find the number of pairs of equal substrings of length k such that the absolute difference between their starting indices is minimized (i.e., for each unique substring, count the number of pairs of occurrences with the smallest possible index difference).

Specifically, for every substring of length k that appears more than once in s, consider all pairs of starting indices (i, j) such that s[i:i+k] == s[j:j+k] and i < j. For each such substring, only count the pair(s) with the minimum absolute difference in indices. Sum this count over all unique substrings of length k.

  • Each pair must consist of two distinct starting indices.
  • Do not reuse the same occurrence of a substring in multiple pairs for the same minimal difference.
  • Return the total number of such pairs across all substrings.

Thought Process

The naive approach is to check every pair of substrings of length k and count those that are equal, then find which pairs have the smallest index difference for each substring. However, this is inefficient because the number of substrings grows rapidly as the input size increases.

To optimize, we can:

  • Group all starting indices for each unique substring of length k using a hash map.
  • For each group, sort the indices and find all pairs with the smallest gap between them.
  • Count how many times this minimal gap occurs for each substring and sum these counts.
This leverages the fact that comparing only consecutive indices in the sorted list gives us the minimal difference efficiently.

Solution Approach

Let's break down the steps:

  1. Extract all substrings of length k:
    • Iterate over s and, for each position i, extract s[i:i+k].
    • Store the starting index i in a hash map where the key is the substring.
  2. Process each group of indices:
    • For each substring with more than one occurrence, sort its list of indices.
    • Compute the differences between each pair of consecutive indices.
    • Find the minimum difference among these.
    • Count how many times this minimum difference occurs (i.e., how many consecutive pairs have this gap).
  3. Sum up the counts:
    • For each substring, only the pairs with the minimal difference are counted.
    • Add up these counts for all substrings to get the final answer.

This method is efficient because it avoids redundant comparisons and leverages efficient data structures for grouping and searching.

Example Walkthrough

Suppose s = "ababcab" and k = 2.

  • All substrings of length 2: ["ab", "ba", "ab", "bc", "ca", "ab"] at indices 0,1,2,3,4,5
  • Grouping indices by substring:
    • "ab": [0,2,5]
    • "ba": [1]
    • "bc": [3]
    • "ca": [4]
  • For "ab", indices = [0,2,5]:
    • Differences: 2-0=2, 5-2=3
    • Minimum difference = 2 (between indices 0 and 2)
    • Only one pair (0,2) has this minimum difference
  • Other substrings appear only once, so no pairs.
  • Final answer: 1 (only one minimal-difference pair for "ab")

Time and Space Complexity

  • Brute-force:
    • Time: O(N^2), since we compare every pair of substrings of length k.
    • Space: O(N), for storing substrings.
  • Optimized approach:
    • Time: O(N * k), to extract all substrings and group indices. For each group, sorting indices is O(M log M) where M is the number of occurrences, but sum over all groups is at most O(N log N).
    • Space: O(N), for the hash map storing indices.

The optimized approach is efficient for large inputs, as it avoids unnecessary comparisons.

Summary

This problem challenges us to efficiently count pairs of equal substrings with the smallest possible index difference. By grouping substring occurrences and focusing on minimal gaps between indices, we avoid brute-force inefficiency. The key insight is that, for each unique substring, consecutive occurrences determine the minimal difference, leading to an elegant and performant solution.

Code Implementation

from collections import defaultdict

def count_min_diff_pairs(s: str, k: int) -> int:
    substr_indices = defaultdict(list)
    n = len(s)
    for i in range(n - k + 1):
        substr = s[i:i+k]
        substr_indices[substr].append(i)
    total = 0
    for indices in substr_indices.values():
        if len(indices) < 2:
            continue
        indices.sort()
        min_diff = float('inf')
        for i in range(1, len(indices)):
            min_diff = min(min_diff, indices[i] - indices[i-1])
        count = 0
        for i in range(1, len(indices)):
            if indices[i] - indices[i-1] == min_diff:
                count += 1
        total += count
    return total
      
#include <unordered_map>
#include <vector>
#include <string>
#include <algorithm>
using namespace std;

int countMinDiffPairs(string s, int k) {
    unordered_map<string, vector<int>> substrIndices;
    int n = s.size();
    for (int i = 0; i <= n - k; ++i) {
        substrIndices[s.substr(i, k)].push_back(i);
    }
    int total = 0;
    for (auto &pair : substrIndices) {
        vector<int>& indices = pair.second;
        if (indices.size() < 2) continue;
        sort(indices.begin(), indices.end());
        int minDiff = INT_MAX;
        for (int i = 1; i < indices.size(); ++i) {
            minDiff = min(minDiff, indices[i] - indices[i-1]);
        }
        int count = 0;
        for (int i = 1; i < indices.size(); ++i) {
            if (indices[i] - indices[i-1] == minDiff) count++;
        }
        total += count;
    }
    return total;
}
      
import java.util.*;

public class Solution {
    public int countMinDiffPairs(String s, int k) {
        Map<String, List<Integer>> substrIndices = new HashMap<>();
        int n = s.length();
        for (int i = 0; i <= n - k; ++i) {
            String sub = s.substring(i, i + k);
            substrIndices.computeIfAbsent(sub, x -> new ArrayList<>()).add(i);
        }
        int total = 0;
        for (List<Integer> indices : substrIndices.values()) {
            if (indices.size() < 2) continue;
            Collections.sort(indices);
            int minDiff = Integer.MAX_VALUE;
            for (int i = 1; i < indices.size(); ++i) {
                minDiff = Math.min(minDiff, indices.get(i) - indices.get(i-1));
            }
            int count = 0;
            for (int i = 1; i < indices.size(); ++i) {
                if (indices.get(i) - indices.get(i-1) == minDiff) count++;
            }
            total += count;
        }
        return total;
    }
}
      
function countMinDiffPairs(s, k) {
    const substrIndices = {};
    const n = s.length;
    for (let i = 0; i <= n - k; ++i) {
        const sub = s.substring(i, i + k);
        if (!(sub in substrIndices)) substrIndices[sub] = [];
        substrIndices[sub].push(i);
    }
    let total = 0;
    for (const indices of Object.values(substrIndices)) {
        if (indices.length < 2) continue;
        indices.sort((a, b) => a - b);
        let minDiff = Infinity;
        for (let i = 1; i < indices.length; ++i) {
            minDiff = Math.min(minDiff, indices[i] - indices[i-1]);
        }
        let count = 0;
        for (let i = 1; i < indices.length; ++i) {
            if (indices[i] - indices[i-1] === minDiff) count++;
        }
        total += count;
    }
    return total;
}