Want Help Cracking FAANG?

(Then click this)

×
Back to Question Bank

271. Encode and Decode Strings - Leetcode Solution

Problem Description

The "Encode and Decode Strings" problem requires you to design an algorithm to encode a list of strings into a single string, and then decode that single string back into the original list of strings. The encoded string should be able to represent any list of strings, including those with empty strings or special characters. The two functions you need to implement are:

  • encode: Takes a list of strings and returns a single encoded string.
  • decode: Takes the encoded string and returns the original list of strings.
Constraints:
  • The encoding must be unambiguous: no two different lists of strings should encode to the same string.
  • The solution must handle empty strings and strings containing any characters.
  • There is only one valid way to encode and decode; do not use built-in serialization libraries.

Thought Process

At first glance, you might think of simply joining the strings with a separator (like a comma or a special character). However, this approach fails if any of the input strings themselves contain the separator, which would make decoding ambiguous.

To guarantee a unique and reversible encoding, we need a way to mark where each string starts and ends, even if the string contains any possible character. A common approach is to prefix each string with its length and a delimiter, so the decoder knows exactly how many characters to read next. This eliminates ambiguity and works for all cases, including empty strings or strings with special characters.

Solution Approach

We use a length-prefix encoding scheme. Here’s how it works:

  1. Encoding:
    • For each string in the list, compute its length.
    • Concatenate the length, a special delimiter (like '#'), and the string itself.
    • Combine all such encoded pieces into one final string.
  2. Decoding:
    • Iterate through the encoded string, looking for the delimiter '#'.
    • The number before '#' tells you how many characters to read next as a string.
    • Repeat until the end of the encoded string, collecting all decoded strings into a list.

This approach is robust because the delimiter is never confused with string content, and the length prefix ensures we always know where each string ends.

Example Walkthrough

Input: ["leet", "code", ""]
Encoding Process:

  • "leet" has length 4: encode as 4#leet
  • "code" has length 4: encode as 4#code
  • "" (empty string) has length 0: encode as 0#
  • Final encoded string: 4#leet4#code0#
Decoding Process:
  • Start at position 0, find first # at index 1. The length is 4.
  • Read 4 characters after #: "leet". Add to result.
  • Move to next section, find next # at index 6. The length is 4.
  • Read 4 characters after #: "code". Add to result.
  • Move to next section, find # at index 11. The length is 0.
  • Read 0 characters: "". Add to result.
  • End of string. Decoded list: ["leet", "code", ""]

Time and Space Complexity

Brute-force approach:

  • If you tried joining strings with a separator and then splitting, decoding could be O(N) in time but would fail for edge cases (strings containing the separator).
Optimized approach (length-prefix):
  • Encoding: O(N), where N is the total number of characters in all strings. We process each character once.
  • Decoding: O(N), as we scan through the encoded string once, parsing lengths and extracting substrings.
  • Space: O(N) for both encoding and decoding, as we store the encoded string and the decoded list.

The approach is linear and efficient in both time and space.

Summary

The key insight is to use a length-prefix encoding, which avoids ambiguity and handles all edge cases. By prefixing each string with its length and a delimiter, we ensure that decoding is straightforward and robust, even for empty strings or those containing special characters. This approach is simple, efficient, and elegant, making it a reliable way to encode and decode lists of strings.

Code Implementation

class Codec:
    def encode(self, strs):
        # Encodes a list of strings to a single string.
        return ''.join(f'{len(s)}#{s}' for s in strs)

    def decode(self, s):
        # Decodes a single string to a list of strings.
        res = []
        i = 0
        while i < len(s):
            j = i
            while s[j] != '#':
                j += 1
            length = int(s[i:j])
            res.append(s[j+1:j+1+length])
            i = j + 1 + length
        return res
      
class Codec {
public:
    // Encodes a list of strings to a single string.
    string encode(vector<string>& strs) {
        string encoded;
        for (const string& s : strs) {
            encoded += to_string(s.length()) + "#" + s;
        }
        return encoded;
    }

    // Decodes a single string to a list of strings.
    vector<string> decode(string s) {
        vector<string> res;
        int i = 0;
        while (i < s.length()) {
            int j = i;
            while (s[j] != '#') j++;
            int len = stoi(s.substr(i, j - i));
            res.push_back(s.substr(j + 1, len));
            i = j + 1 + len;
        }
        return res;
    }
};
      
public class Codec {
    // Encodes a list of strings to a single string.
    public String encode(List<String> strs) {
        StringBuilder sb = new StringBuilder();
        for (String s : strs) {
            sb.append(s.length()).append("#").append(s);
        }
        return sb.toString();
    }

    // Decodes a single string to a list of strings.
    public List<String> decode(String s) {
        List<String> res = new ArrayList<>();
        int i = 0;
        while (i < s.length()) {
            int j = i;
            while (s.charAt(j) != '#') {
                j++;
            }
            int len = Integer.parseInt(s.substring(i, j));
            res.add(s.substring(j + 1, j + 1 + len));
            i = j + 1 + len;
        }
        return res;
    }
}
      
var Codec = function() {};

Codec.prototype.encode = function(strs) {
    // Encodes a list of strings to a single string.
    return strs.map(s => s.length + '#' + s).join('');
};

Codec.prototype.decode = function(s) {
    // Decodes a single string to a list of strings.
    let res = [];
    let i = 0;
    while (i < s.length) {
        let j = i;
        while (s[j] !== '#') j++;
        let len = parseInt(s.slice(i, j));
        res.push(s.slice(j + 1, j + 1 + len));
        i = j + 1 + len;
    }
    return res;
};