Want Help Cracking FAANG?

(Then click this)

×
Back to Question Bank

1520. Maximum Number of Non-Overlapping Substrings - Leetcode Solution

Code Implementation

class Solution:
    def maxNumOfSubstrings(self, s: str) -> list:
        n = len(s)
        first = [n] * 26
        last = [-1] * 26
        for i, c in enumerate(s):
            idx = ord(c) - ord('a')
            first[idx] = min(first[idx], i)
            last[idx] = max(last[idx], i)
        intervals = []
        for i in range(26):
            if first[i] <= last[i]:
                l, r = first[i], last[i]
                j = l
                while j <= r:
                    idx2 = ord(s[j]) - ord('a')
                    l = min(l, first[idx2])
                    r = max(r, last[idx2])
                    j += 1
                intervals.append((l, r))
        intervals.sort(key=lambda x: x[1])
        res = []
        end = -1
        for l, r in intervals:
            if l > end:
                res.append(s[l:r+1])
                end = r
        return res
      
class Solution {
public:
    vector<string> maxNumOfSubstrings(string s) {
        int n = s.size();
        vector<int> first(26, n), last(26, -1);
        for (int i = 0; i < n; ++i) {
            int idx = s[i] - 'a';
            first[idx] = min(first[idx], i);
            last[idx] = max(last[idx], i);
        }
        vector<pair<int, int>> intervals;
        for (int i = 0; i < 26; ++i) {
            if (first[i] <= last[i]) {
                int l = first[i], r = last[i];
                int j = l;
                while (j <= r) {
                    int idx2 = s[j] - 'a';
                    l = min(l, first[idx2]);
                    r = max(r, last[idx2]);
                    ++j;
                }
                intervals.push_back({l, r});
            }
        }
        sort(intervals.begin(), intervals.end(), [](auto& a, auto& b){ return a.second < b.second; });
        vector<string> res;
        int end = -1;
        for (auto& p : intervals) {
            if (p.first > end) {
                res.push_back(s.substr(p.first, p.second - p.first + 1));
                end = p.second;
            }
        }
        return res;
    }
};
      
class Solution {
    public List<String> maxNumOfSubstrings(String s) {
        int n = s.length();
        int[] first = new int[26];
        int[] last = new int[26];
        Arrays.fill(first, n);
        Arrays.fill(last, -1);
        for (int i = 0; i < n; i++) {
            int idx = s.charAt(i) - 'a';
            first[idx] = Math.min(first[idx], i);
            last[idx] = Math.max(last[idx], i);
        }
        List<int[]> intervals = new ArrayList<>();
        for (int i = 0; i < 26; i++) {
            if (first[i] <= last[i]) {
                int l = first[i], r = last[i];
                int j = l;
                while (j <= r) {
                    int idx2 = s.charAt(j) - 'a';
                    l = Math.min(l, first[idx2]);
                    r = Math.max(r, last[idx2]);
                    j++;
                }
                intervals.add(new int[]{l, r});
            }
        }
        intervals.sort(Comparator.comparingInt(a -> a[1]));
        List<String> res = new ArrayList<>();
        int end = -1;
        for (int[] p : intervals) {
            if (p[0] > end) {
                res.add(s.substring(p[0], p[1] + 1));
                end = p[1];
            }
        }
        return res;
    }
}
      
var maxNumOfSubstrings = function(s) {
    const n = s.length;
    const first = Array(26).fill(n);
    const last = Array(26).fill(-1);
    for (let i = 0; i < n; i++) {
        let idx = s.charCodeAt(i) - 97;
        first[idx] = Math.min(first[idx], i);
        last[idx] = Math.max(last[idx], i);
    }
    let intervals = [];
    for (let i = 0; i < 26; i++) {
        if (first[i] <= last[i]) {
            let l = first[i], r = last[i], j = l;
            while (j <= r) {
                let idx2 = s.charCodeAt(j) - 97;
                l = Math.min(l, first[idx2]);
                r = Math.max(r, last[idx2]);
                j++;
            }
            intervals.push([l, r]);
        }
    }
    intervals.sort((a, b) => a[1] - b[1]);
    let res = [];
    let end = -1;
    for (let [l, r] of intervals) {
        if (l > end) {
            res.push(s.substring(l, r + 1));
            end = r;
        }
    }
    return res;
};
      

Problem Description

The "Maximum Number of Non-Overlapping Substrings" problem asks you to find the largest set of non-overlapping substrings from a given string s, such that each substring contains all occurrences of every character it contains.

More precisely:
  • Each substring must cover all appearances of every character inside it. For example, if a substring contains the letter 'a', it must include every occurrence of 'a' that appears between the first and last 'a' in s.
  • The substrings must not overlap (no shared indices).
  • You must return the maximum number of such substrings (there may be multiple valid answers, but any one is acceptable).
  • Each substring should be as short as possible, i.e., you cannot shrink any substring and still satisfy the requirement.
Constraints:
  • 1 <= s.length <= 1000
  • s consists only of lowercase English letters.

Thought Process

To solve this problem, let's first break down what is being asked:
  • We need to find substrings that "capture" all appearances of the characters they contain.
  • We want as many such substrings as possible, and they cannot overlap.

A brute-force approach would be to consider all possible substrings and check if they satisfy the requirements, but this would be too slow for large strings.

Instead, let's think about each character individually:
  • For each character, we can find its first and last occurrence in s.
  • The minimal substring that "captures" a character is from its first to its last occurrence, but if there are other characters in between whose own first/last occurrences extend beyond this window, we must expand the window to include them as well.
  • After creating all such minimal windows, we can try to pick as many non-overlapping ones as possible, similar to the "Interval Scheduling" problem.
The challenge is to efficiently expand the intervals and select the maximum set of non-overlapping intervals.

Solution Approach

Here is a step-by-step guide to solving the problem:
  1. Find First and Last Occurrences:
    • For each character ('a' to 'z'), record its first and last index in s.
  2. Build Minimal Valid Intervals:
    • For each character, try to build the smallest substring that contains all of its occurrences and, recursively, all occurrences of any other character within that range.
    • This is done by starting with the character's first and last index, then for every character inside that window, expand the window to include their own first and last appearances, repeating until the window doesn't change.
  3. Collect All Valid Intervals:
    • Store all unique intervals found in the previous step.
  4. Sort Intervals by End Index:
    • Sort the intervals by their end index (right endpoint). This helps in selecting the maximum number of non-overlapping intervals.
  5. Greedy Selection of Non-Overlapping Intervals:
    • Iterate through the sorted intervals, always picking the next interval that starts after the end of the previous chosen interval.
  6. Return the Substrings:
    • For each selected interval, extract the corresponding substring from s and return the list.
This approach guarantees maximum non-overlapping substrings, each as small as possible and containing all occurrences of their characters.

Example Walkthrough

Example Input: s = "adefaddaccc"

Step-by-step:
  1. First and last for each char:
    • 'a': 0 to 7
    • 'd': 1 to 6
    • 'e': 2 to 2
    • 'f': 3 to 3
    • 'c': 8 to 10
  2. Build minimal intervals:
    • For 'a': start at [0,7]. Inside this window, we see 'd' at 1 (last at 6), so window is [0,7]. No further expansion.
    • For 'e': [2,2] (no expansion needed).
    • For 'f': [3,3] (no expansion needed).
    • For 'c': [8,10] (no expansion needed).
  3. Intervals: [0,7], [2,2], [3,3], [8,10]
  4. Sort by end: [2,2], [3,3], [0,7], [8,10]
  5. Greedily pick:
    • Pick [2,2]: covers 'e'
    • Next, [3,3]: doesn't overlap, covers 'f'
    • [0,7]: overlaps with previous, skip
    • [8,10]: doesn't overlap, covers 'c'
  6. Result substrings: "e", "f", "ccc"
Output: ["e","f","ccc"] (order may vary)

Time and Space Complexity

  • Brute-force: O(N3) or worse, since checking all substrings and their validity is very slow.
  • Optimized Approach:
    • Finding first/last index for each character: O(N)
    • Expanding intervals for each character: O(26 * N) at worst, since there are at most 26 letters.
    • Sorting intervals: O(26 log 26) = O(1)
    • Greedy selection: O(26)
    • Total Time: O(N)
    • Space Complexity: O(N) for storing intervals and results.

Summary

The solution finds, for every character, the minimal substring that contains all its occurrences and those of any other character inside it. By collecting these intervals and greedily picking the maximum set of non-overlapping intervals, we ensure the answer is both valid and maximal. The approach is efficient, elegant, and leverages interval scheduling concepts for optimal performance.