467. Unique Substrings in Wraparound String - Leetcode Solution

Problem Description

Given a string p, return the number of unique non-empty substrings of p that are present in the infinite wraparound string of lowercase English letters, which is:
...zabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcd...
In other words, the wraparound string is formed by infinitely repeating the alphabet in order, wrapping from z back to a.

A substring of p is considered valid if it appears as a substring in this infinite wraparound string. For example, abc, zab, and yza are all valid.

Constraints:

1 ≤ p.length ≤ 10⁵
p consists only of lowercase English letters.

Thought Process

At first glance, you might think to enumerate all possible substrings of p and check if each one is a valid substring in the infinite wraparound string. However, this would be extremely inefficient since the number of substrings is O(n²).

Upon closer inspection, we notice that a substring is valid if its letters are consecutive in the alphabet, considering the wraparound from z to a. For example, cde and xyzab are both valid.

The key insight is that for each position in p, we only need to track the length of the longest valid consecutive substring ending with each character. This is because any shorter substring ending at the same character is already included as a substring of the longer one.

This shift from brute-force enumeration to tracking maximal substrings per character is the main optimization.

Solution Approach

Let's break down the optimized solution step by step:

Initialize a Hash Map or Array:
- Create a map (or array of size 26) maxLen to record the maximum length of a valid substring ending with each character c ('a' to 'z').
Iterate Over the String:
- For each character in p, determine if it continues a valid consecutive substring from the previous character (i.e., the current character follows the previous one in the alphabet, with wraparound).
- If yes, increment the current valid substring length (currLen); otherwise, reset currLen to 1.
Update the Maximal Lengths:
- For the current character, update maxLen[char] to be the maximum of its current value and currLen.
Sum Up the Results:
- The answer is the sum of all values in maxLen. This is because for a character c with maximal length L, there are L unique substrings ending at c (of length 1 to L).

This approach ensures we count each unique valid substring exactly once, and the process is efficient.

Example Walkthrough

Let's walk through an example with p = "zab":

First character 'z':
- It's the start, so currLen = 1.
- Update maxLen['z'] = 1.
Second character 'a':
- Check if 'a' follows 'z' in wraparound: Yes!
- currLen = 2.
- Update maxLen['a'] = 2 (since 2 > 0).
Third character 'b':
- 'b' follows 'a': Yes!
- currLen = 3.
- Update maxLen['b'] = 3.

Now, sum all values: maxLen['z'] = 1, maxLen['a'] = 2, maxLen['b'] = 3, others are 0.
Total = 1 + 2 + 3 = 6.

The 6 unique substrings are: 'z', 'a', 'b', 'za', 'ab', 'zab'.

Code Implementation

class Solution:
    def findSubstringInWraproundString(self, p: str) -> int:
        maxLen = [0] * 26
        currLen = 0
        for i in range(len(p)):
            if i > 0 and (ord(p[i]) - ord(p[i-1]) == 1 or (p[i-1] == 'z' and p[i] == 'a')):
                currLen += 1
            else:
                currLen = 1
            idx = ord(p[i]) - ord('a')
            maxLen[idx] = max(maxLen[idx], currLen)
        return sum(maxLen)

class Solution {
public:
    int findSubstringInWraproundString(string p) {
        vector<int> maxLen(26, 0);
        int currLen = 0;
        for (int i = 0; i < p.size(); ++i) {
            if (i > 0 && ((p[i] - p[i-1] == 1) || (p[i-1] == 'z' && p[i] == 'a')))
                currLen++;
            else
                currLen = 1;
            int idx = p[i] - 'a';
            maxLen[idx] = max(maxLen[idx], currLen);
        }
        int result = 0;
        for (int len : maxLen) result += len;
        return result;
    }
};

class Solution {
    public int findSubstringInWraproundString(String p) {
        int[] maxLen = new int[26];
        int currLen = 0;
        for (int i = 0; i < p.length(); i++) {
            if (i > 0 && ((p.charAt(i) - p.charAt(i-1) == 1) || (p.charAt(i-1) == 'z' && p.charAt(i) == 'a'))) {
                currLen++;
            } else {
                currLen = 1;
            }
            int idx = p.charAt(i) - 'a';
            maxLen[idx] = Math.max(maxLen[idx], currLen);
        }
        int result = 0;
        for (int len : maxLen) result += len;
        return result;
    }
}

var findSubstringInWraproundString = function(p) {
    let maxLen = new Array(26).fill(0);
    let currLen = 0;
    for (let i = 0; i < p.length; i++) {
        if (i > 0 && ((p.charCodeAt(i) - p.charCodeAt(i-1) === 1) || (p[i-1] === 'z' && p[i] === 'a'))) {
            currLen++;
        } else {
            currLen = 1;
        }
        let idx = p.charCodeAt(i) - 'a'.charCodeAt(0);
        maxLen[idx] = Math.max(maxLen[idx], currLen);
    }
    return maxLen.reduce((a, b) => a + b, 0);
};

Time and Space Complexity

Brute-force approach:
- Enumerating all substrings is O(n²) in time and space (for a string of length n), which is not feasible for large n.
Optimized approach (above):
- Time Complexity: O(n), since we traverse the string once and all updates to maxLen are O(1).
- Space Complexity: O(1), since maxLen is a fixed-size array of 26 elements (for the alphabet).

This efficiency is possible because we avoid storing all substrings and instead only track the maximal length per character.

Summary

The key to solving the "Unique Substrings in Wraparound String" problem efficiently is recognizing that we only need to track the longest valid substring ending at each character. This avoids the need to enumerate every substring and leverages the structure of the wraparound alphabet. By using a simple array and a single pass through the string, we achieve an elegant, optimal O(n) solution that counts all unique valid substrings exactly once.

AlgoMap.io

Want Help Cracking FAANG?