Given an array arr
of strings, your task is to find the maximum possible length of a concatenated string with unique characters, formed by concatenating any subset of arr
. Each string in arr
consists of lowercase English letters. You may use each string at most once, and you may choose to use any number of strings (including none).
1 ≤ arr.length ≤ 16
; 1 ≤ arr[i].length ≤ 26
; arr[i]
contains only lowercase English letters.
At first glance, this problem seems to require checking all possible ways to combine the given strings and see which combinations yield the longest string with all unique characters. This is a classic subset or backtracking problem, since for each string, you can either include it or exclude it.
The brute-force approach would be to try all possible subsets of the input array, concatenate the strings in each subset, and check if the result has all unique characters. However, since the number of subsets grows exponentially with the number of strings (2^n), we need to consider if the problem's constraints allow this.
Upon closer inspection, since arr.length
is at most 16, it is feasible to check all subsets. Still, we can optimize by skipping subsets that are doomed to fail early—for example, if a string itself contains duplicate letters, it cannot be used in any solution. We can also use bitmasks to efficiently check for duplicate characters.
Let's break the solution down step-by-step:
This approach efficiently explores all valid combinations and avoids unnecessary work by pruning invalid paths early.
Let's walk through an example:
arr = ["un", "iq", "ue"]
We try all combinations:
The maximum length found is 6 ("unique").
class Solution:
def maxLength(self, arr):
def to_mask(s):
mask = 0
for c in s:
bit = 1 << (ord(c) - ord('a'))
if mask & bit:
return 0 # duplicate char in s
mask |= bit
return mask
masks = []
for s in arr:
m = to_mask(s)
if m:
masks.append((m, len(s)))
def dfs(idx, curr_mask, curr_len):
if idx == len(masks):
return curr_len
max_len = dfs(idx + 1, curr_mask, curr_len)
m, l = masks[idx]
if not (curr_mask & m):
max_len = max(max_len, dfs(idx + 1, curr_mask | m, curr_len + l))
return max_len
return dfs(0, 0, 0)
class Solution {
public:
int maxLength(vector<string>& arr) {
vector<pair<int, int>> masks;
for (const string& s : arr) {
int mask = 0;
bool valid = true;
for (char c : s) {
int bit = 1 << (c - 'a');
if (mask & bit) {
valid = false;
break;
}
mask |= bit;
}
if (valid) masks.push_back({mask, (int)s.size()});
}
return dfs(masks, 0, 0, 0);
}
private:
int dfs(const vector<pair<int, int>>& masks, int idx, int curr_mask, int curr_len) {
if (idx == masks.size()) return curr_len;
int max_len = dfs(masks, idx + 1, curr_mask, curr_len);
int m = masks[idx].first, l = masks[idx].second;
if (!(curr_mask & m))
max_len = max(max_len, dfs(masks, idx + 1, curr_mask | m, curr_len + l));
return max_len;
}
};
class Solution {
public int maxLength(List<String> arr) {
List<int[]> masks = new ArrayList<>();
for (String s : arr) {
int mask = 0;
boolean valid = true;
for (char c : s.toCharArray()) {
int bit = 1 << (c - 'a');
if ((mask & bit) != 0) {
valid = false;
break;
}
mask |= bit;
}
if (valid) masks.add(new int[]{mask, s.length()});
}
return dfs(masks, 0, 0, 0);
}
private int dfs(List<int[]> masks, int idx, int currMask, int currLen) {
if (idx == masks.size()) return currLen;
int maxLen = dfs(masks, idx + 1, currMask, currLen);
int m = masks.get(idx)[0], l = masks.get(idx)[1];
if ((currMask & m) == 0)
maxLen = Math.max(maxLen, dfs(masks, idx + 1, currMask | m, currLen + l));
return maxLen;
}
}
var maxLength = function(arr) {
function toMask(s) {
let mask = 0;
for (let i = 0; i < s.length; i++) {
let bit = 1 << (s.charCodeAt(i) - 97);
if ((mask & bit) !== 0) return 0; // duplicate char
mask |= bit;
}
return mask;
}
let masks = [];
for (let s of arr) {
let m = toMask(s);
if (m) masks.push([m, s.length]);
}
function dfs(idx, currMask, currLen) {
if (idx === masks.length) return currLen;
let maxLen = dfs(idx + 1, currMask, currLen);
let [m, l] = masks[idx];
if ((currMask & m) === 0)
maxLen = Math.max(maxLen, dfs(idx + 1, currMask | m, currLen + l));
return maxLen;
}
return dfs(0, 0, 0);
};
Brute-force approach: Tries all subsets (2^n), and for each subset, checks if the concatenated string has unique characters (O(total string length)), so total time is O(2^n * L), where L is the sum of string lengths.
Optimized approach (bitmask + backtracking): Still explores up to 2^n subsets, but checks for overlapping characters in O(1) time using bitmasks. So, time complexity is O(2^n). Space complexity is O(n) for the recursion stack and O(n) for storing masks.
The constraints (n ≤ 16) keep this feasible.
The key to this problem is representing sets of unique characters as bitmasks and using backtracking to efficiently explore all subsets, pruning paths that would violate the uniqueness constraint. This approach is elegant because it leverages bitwise operations for speed and clarity, and it is practical due to the small input size. The solution demonstrates how a classic combinatorial problem can be made efficient with the right representation and early pruning.