The "Encode and Decode Strings" problem requires you to design an algorithm to encode a list of strings into a single string, and then decode that single string back into the original list of strings. The encoded string should be able to represent any list of strings, including those with empty strings or special characters. The two functions you need to implement are:
encode
: Takes a list of strings and returns a single encoded string.decode
: Takes the encoded string and returns the original list of strings.
At first glance, you might think of simply joining the strings with a separator (like a comma or a special character). However, this approach fails if any of the input strings themselves contain the separator, which would make decoding ambiguous.
To guarantee a unique and reversible encoding, we need a way to mark where each string starts and ends, even if the string contains any possible character. A common approach is to prefix each string with its length and a delimiter, so the decoder knows exactly how many characters to read next. This eliminates ambiguity and works for all cases, including empty strings or strings with special characters.
We use a length-prefix encoding scheme. Here’s how it works:
'#'
), and the string itself.'#'
.'#'
tells you how many characters to read next as a string.This approach is robust because the delimiter is never confused with string content, and the length prefix ensures we always know where each string ends.
Input: ["leet", "code", ""]
Encoding Process:
4#leet
4#code
0#
4#leet4#code0#
#
at index 1. The length is 4
.#
: "leet". Add to result.#
at index 6. The length is 4
.#
: "code". Add to result.#
at index 11. The length is 0
.["leet", "code", ""]
Brute-force approach:
The approach is linear and efficient in both time and space.
The key insight is to use a length-prefix encoding, which avoids ambiguity and handles all edge cases. By prefixing each string with its length and a delimiter, we ensure that decoding is straightforward and robust, even for empty strings or those containing special characters. This approach is simple, efficient, and elegant, making it a reliable way to encode and decode lists of strings.
class Codec:
def encode(self, strs):
# Encodes a list of strings to a single string.
return ''.join(f'{len(s)}#{s}' for s in strs)
def decode(self, s):
# Decodes a single string to a list of strings.
res = []
i = 0
while i < len(s):
j = i
while s[j] != '#':
j += 1
length = int(s[i:j])
res.append(s[j+1:j+1+length])
i = j + 1 + length
return res
class Codec {
public:
// Encodes a list of strings to a single string.
string encode(vector<string>& strs) {
string encoded;
for (const string& s : strs) {
encoded += to_string(s.length()) + "#" + s;
}
return encoded;
}
// Decodes a single string to a list of strings.
vector<string> decode(string s) {
vector<string> res;
int i = 0;
while (i < s.length()) {
int j = i;
while (s[j] != '#') j++;
int len = stoi(s.substr(i, j - i));
res.push_back(s.substr(j + 1, len));
i = j + 1 + len;
}
return res;
}
};
public class Codec {
// Encodes a list of strings to a single string.
public String encode(List<String> strs) {
StringBuilder sb = new StringBuilder();
for (String s : strs) {
sb.append(s.length()).append("#").append(s);
}
return sb.toString();
}
// Decodes a single string to a list of strings.
public List<String> decode(String s) {
List<String> res = new ArrayList<>();
int i = 0;
while (i < s.length()) {
int j = i;
while (s.charAt(j) != '#') {
j++;
}
int len = Integer.parseInt(s.substring(i, j));
res.add(s.substring(j + 1, j + 1 + len));
i = j + 1 + len;
}
return res;
}
}
var Codec = function() {};
Codec.prototype.encode = function(strs) {
// Encodes a list of strings to a single string.
return strs.map(s => s.length + '#' + s).join('');
};
Codec.prototype.decode = function(s) {
// Decodes a single string to a list of strings.
let res = [];
let i = 0;
while (i < s.length) {
let j = i;
while (s[j] !== '#') j++;
let len = parseInt(s.slice(i, j));
res.push(s.slice(j + 1, j + 1 + len));
i = j + 1 + len;
}
return res;
};