class Solution:
def minimumLengthEncoding(self, words):
words = set(words)
for word in list(words):
for k in range(1, len(word)):
words.discard(word[k:])
return sum(len(word) + 1 for word in words)
class Solution {
public:
int minimumLengthEncoding(vector<string>& words) {
unordered_set<string> s(words.begin(), words.end());
for (const string& word : words) {
for (int k = 1; k < word.size(); ++k) {
s.erase(word.substr(k));
}
}
int res = 0;
for (const string& word : s) {
res += word.size() + 1;
}
return res;
}
};
class Solution {
public int minimumLengthEncoding(String[] words) {
Set<String> s = new HashSet<>(Arrays.asList(words));
for (String word : words) {
for (int k = 1; k < word.length(); ++k) {
s.remove(word.substring(k));
}
}
int res = 0;
for (String word : s) {
res += word.length() + 1;
}
return res;
}
}
var minimumLengthEncoding = function(words) {
let s = new Set(words);
for (let word of words) {
for (let k = 1; k < word.length; ++k) {
s.delete(word.slice(k));
}
}
let res = 0;
for (let word of s) {
res += word.length + 1;
}
return res;
};
Given a list of words
, we want to encode them as a single string in such a way that each word is a suffix of the string, and every word ends with a '#'
character. The goal is to minimize the length of this encoded string. If a word is already a suffix of another word, it does not need to be encoded separately. The output is the minimum possible length of such an encoding.
'#'
.
At first glance, it may seem that we need to encode every word separately, simply joining them together with '#'
characters. However, the key insight is that if a word is a suffix of another word, it does not need to be encoded on its own. For example, if "time" and "me" are both in the list, encoding "time#" already covers "me" as a suffix.
A brute-force approach might try to check for all possible pairs of words to see if one is a suffix of another, but this is inefficient. Instead, we can optimize by focusing only on the words that are not suffixes of any other word. Only these "root" words need to be encoded, as all their suffixes are automatically included.
'#'
at the end).This approach is efficient because:
Consider the input words = ["time", "me", "bell"]
.
Both "me" and "e" are suffixes of "time", so they do not need to be encoded separately.
The shortest encoding of words problem is elegantly solved by removing all words that are suffixes of any other word, and only encoding the remaining words. The use of a set makes suffix removal efficient, and the final answer is simply the sum of the lengths of these root words plus one for each '#'
. This approach avoids unnecessary computations and leverages the properties of sets for optimal performance.