You are given a list of folder paths in the filesystem, represented as an array of strings folder
. Each string is a valid absolute path starting with the root directory '/'
.
Your task is to remove all sub-folders from the list, so that only the highest-level folders remain. A folder a
is considered a sub-folder of b
if a
starts with b + '/'
. You should return the list of folders after removing all sub-folders, in any order.
Constraints:
'/'
.At first glance, you might think of comparing every folder with every other folder to check if one is a sub-folder of another. However, this brute-force approach would be inefficient, especially for large inputs.
Instead, we notice that if we sort all the folder paths lexicographically, all sub-folders of a folder will appear immediately after it. For example, /a
will come before /a/b
, /a/b/c
, and so on. This means we can process the list in order and only keep a folder if it is not a sub-folder of the previously kept folder.
This insight allows us to avoid unnecessary comparisons and makes the solution both simpler and faster.
'/'
.
This approach is efficient because it only compares each folder with the most recently added folder, and the sorting ensures correctness.
Input: ["/a","/a/b","/c/d","/c/d/e","/c/f"]
["/a", "/a/b", "/c/d", "/c/d/e", "/c/f"]
result = []
"/a"
:
result
is empty, so add "/a"
."/a/b"
:
"/a/b"
start with "/a/"
? Yes. Skip."/c/d"
:
"/c/d"
start with "/a/"
? No. Add "/c/d"
."/c/d/e"
:
"/c/d/e"
start with "/c/d/"
? Yes. Skip."/c/f"
:
"/c/f"
start with "/c/d/"
? No. Add "/c/f"
.
Output: ["/a", "/c/d", "/c/f"]
Brute-force approach: For each folder, compare with every other folder to check if it's a sub-folder, leading to O(N^2 * L)
time, where N
is the number of folders and L
is the average length of the folder path.
Optimized approach:
O(N \log N)
time.O(N * L)
time.O(N \log N + N * L)
.O(N * L)
for storing the result and sorted list.The key to solving this problem efficiently is recognizing that sorting folder paths lexicographically brings sub-folders directly after their parent folders. By comparing each folder only with the most recently added top-level folder, we can efficiently filter out sub-folders in a single pass. This approach is both elegant and efficient, making use of simple string operations and sorting.
class Solution:
def removeSubfolders(self, folder):
folder.sort()
res = []
for f in folder:
if not res or not f.startswith(res[-1] + '/'):
res.append(f)
return res
class Solution {
public:
vector<string> removeSubfolders(vector<string>& folder) {
sort(folder.begin(), folder.end());
vector<string> res;
for (const string& f : folder) {
if (res.empty() || f.find(res.back() + "/") != 0) {
res.push_back(f);
}
}
return res;
}
};
import java.util.*;
class Solution {
public List<String> removeSubfolders(String[] folder) {
Arrays.sort(folder);
List<String> res = new ArrayList<>();
for (String f : folder) {
if (res.isEmpty() || !f.startsWith(res.get(res.size() - 1) + "/")) {
res.add(f);
}
}
return res;
}
}
var removeSubfolders = function(folder) {
folder.sort();
let res = [];
for (let f of folder) {
if (res.length === 0 || !f.startsWith(res[res.length - 1] + "/")) {
res.push(f);
}
}
return res;
};