Want Help Cracking FAANG?

(Then click this)

×
Back to Question Bank

471. Encode String with Shortest Length - Leetcode Solution

Problem Description

Given a non-empty string s, encode it such that its encoded length is as short as possible. The encoding rule is: k[encoded_string], where the encoded_string inside the square brackets is repeated exactly k times. If encoding does not make the string shorter, return the original string.

For example, "aaa" can be encoded as "3[a]" but since "3[a]" is not shorter than "aaa", we keep the original. However, "aaaaa" can be encoded as "5[a]".

  • Input: a string s (1 ≤ s.length ≤ 160)
  • Output: the shortest encoded string possible using the above rule

The encoding must be valid and use the format k[encoded_string] for repeated substrings, and you may use encoding recursively within encoded_string.

Thought Process

At first glance, we might think of simply finding repeated substrings and compressing them. However, the challenge is that repetitions can be nested, and sometimes encoding a substring recursively can result in a shorter overall encoding. Brute-force approaches, such as trying every possible substring and encoding, would be too slow for large strings.

To optimize, we can use dynamic programming. The idea is to build up solutions for shorter substrings and use them to solve for longer substrings. For every substring, we try all possible splits and all possible repeated patterns, and choose the shortest encoding at each step.

This approach is similar to solving the "word break" problem or optimal matrix chain multiplication, where we build up from small to large segments.

Solution Approach

  • Dynamic Programming Table: We define dp[i][j] as the shortest encoded string for substring s[i:j+1].
  • Base Case: For substrings of length 1, the encoding is the character itself.
  • Splitting: For each substring s[i:j+1], try every split point k between i and j and set dp[i][j] as the minimum of dp[i][k] + dp[k+1][j].
  • Pattern Search: For each substring, check if it can be represented as multiple repeats of a smaller substring. If so, encode it as repeat_count[encoded_pattern] where encoded_pattern is the shortest encoding of that smaller substring.
  • Choosing Minimum: For each substring, select the shortest among the original, all splits, and all encodings.
  • Return: The answer is dp[0][n-1] (the shortest encoding for the whole string).

The DP approach ensures that all possible encodings are considered, and by always selecting the shortest, we guarantee the optimal solution.

Example Walkthrough

Let's consider the input s = "ababab".

  1. For substring "ab", there's no shorter encoding, so dp[0][1] = "ab".
  2. For substring "abab" (i=0, j=3), we can split at k=1 ("ab" + "ab"), but also notice that "abab" is 2 * "ab". So, 2[ab] is shorter than "abab".
  3. For substring "ababab" (i=0, j=5), we can split at k=3 ("abab" + "ab"), or notice that "ababab" is 3 * "ab". So, 3[ab] is the shortest encoding.

The final answer for "ababab" is "3[ab]".

Time and Space Complexity

  • Brute-force: For every substring, try all possible encodings, leading to exponential time complexity in the worst case.
  • Optimized DP: There are O(n^2) substrings. For each, we try all possible splits (O(n)) and check for repeated patterns (O(n)), resulting in O(n^3) time complexity.
  • Space Complexity: The DP table uses O(n^2) space.

For n = 160, this approach is efficient enough.

Summary

This problem requires minimizing the encoded length of a string using a specific repeat encoding format. By using dynamic programming, we efficiently explore all possible encodings for all substrings, always selecting the shortest. The solution is elegant because it systematically builds up from simple cases, avoids redundant calculations, and guarantees optimality by always choosing the minimal encoding at every step.

Code Implementation

class Solution:
    def encode(self, s: str) -> str:
        n = len(s)
        dp = [["" for _ in range(n)] for _ in range(n)]
        for l in range(1, n+1):  # length of substring
            for i in range(n - l + 1):
                j = i + l - 1
                substr = s[i:j+1]
                dp[i][j] = substr
                # Try all possible splits
                for k in range(i, j):
                    if len(dp[i][k] + dp[k+1][j]) < len(dp[i][j]):
                        dp[i][j] = dp[i][k] + dp[k+1][j]
                # Try to encode by repeating pattern
                for k in range(1, l//2 + 1):
                    if l % k == 0:
                        repeat = substr[:k]
                        if repeat * (l // k) == substr:
                            encoded = str(l // k) + "[" + dp[i][i+k-1] + "]"
                            if len(encoded) < len(dp[i][j]):
                                dp[i][j] = encoded
        return dp[0][n-1]
      
class Solution {
public:
    string encode(string s) {
        int n = s.size();
        vector<vector<string>> dp(n, vector<string>(n, ""));
        for (int l = 1; l <= n; ++l) {
            for (int i = 0; i + l - 1 < n; ++i) {
                int j = i + l - 1;
                string substr = s.substr(i, l);
                dp[i][j] = substr;
                // Try all possible splits
                for (int k = i; k < j; ++k) {
                    if (dp[i][k].size() + dp[k+1][j].size() < dp[i][j].size()) {
                        dp[i][j] = dp[i][k] + dp[k+1][j];
                    }
                }
                // Try to encode by repeating pattern
                for (int k = 1; k <= l/2; ++k) {
                    if (l % k == 0) {
                        string repeat = substr.substr(0, k);
                        string repeated = "";
                        for (int t = 0; t < l/k; ++t) repeated += repeat;
                        if (repeated == substr) {
                            string encoded = to_string(l/k) + "[" + dp[i][i+k-1] + "]";
                            if (encoded.size() < dp[i][j].size()) {
                                dp[i][j] = encoded;
                            }
                        }
                    }
                }
            }
        }
        return dp[0][n-1];
    }
};
      
class Solution {
    public String encode(String s) {
        int n = s.length();
        String[][] dp = new String[n][n];
        for (int l = 1; l <= n; l++) {
            for (int i = 0; i + l - 1 < n; i++) {
                int j = i + l - 1;
                String substr = s.substring(i, j + 1);
                dp[i][j] = substr;
                // Try all possible splits
                for (int k = i; k < j; k++) {
                    if (dp[i][k].length() + dp[k+1][j].length() < dp[i][j].length()) {
                        dp[i][j] = dp[i][k] + dp[k+1][j];
                    }
                }
                // Try to encode by repeating pattern
                for (int k = 1; k <= l/2; k++) {
                    if (l % k == 0) {
                        String repeat = substr.substring(0, k);
                        StringBuilder repeated = new StringBuilder();
                        for (int t = 0; t < l / k; t++) repeated.append(repeat);
                        if (repeated.toString().equals(substr)) {
                            String encoded = (l / k) + "[" + dp[i][i + k - 1] + "]";
                            if (encoded.length() < dp[i][j].length()) {
                                dp[i][j] = encoded;
                            }
                        }
                    }
                }
            }
        }
        return dp[0][n - 1];
    }
}
      
var encode = function(s) {
    const n = s.length;
    const dp = Array.from({length: n}, () => Array(n).fill(""));
    for (let l = 1; l <= n; l++) {
        for (let i = 0; i + l - 1 < n; i++) {
            let j = i + l - 1;
            let substr = s.substring(i, j + 1);
            dp[i][j] = substr;
            // Try all possible splits
            for (let k = i; k < j; k++) {
                if ((dp[i][k] + dp[k+1][j]).length < dp[i][j].length) {
                    dp[i][j] = dp[i][k] + dp[k+1][j];
                }
            }
            // Try to encode by repeating pattern
            for (let k = 1; k <= Math.floor(l/2); k++) {
                if (l % k === 0) {
                    let repeat = substr.substring(0, k);
                    if (repeat.repeat(l / k) === substr) {
                        let encoded = (l / k) + "[" + dp[i][i + k - 1] + "]";
                        if (encoded.length < dp[i][j].length) {
                            dp[i][j] = encoded;
                        }
                    }
                }
            }
        }
    }
    return dp[0][n-1];
};