811. Subdomain Visit Count - Leetcode Solution

Problem Description

You are given a list of strings called cpdomains, where each string has the format "count domain". Here, count is the number of visits to the domain. For every domain in the list, a visit to that domain also counts as a visit to all of its subdomains. A subdomain is defined as any suffix of the domain that starts at a dot ('.') boundary.

Your task is to return a list of strings representing the number of visits to each subdomain, in the format "count subdomain". The order of the output does not matter.

Constraints:

Each domain string is valid and contains only lowercase letters, digits, hyphens, and dots.
There is at least one domain in cpdomains.

For example, given cpdomains = ["9001 discuss.leetcode.com"], the output should be ["9001 discuss.leetcode.com", "9001 leetcode.com", "9001 com"].

Thought Process

The problem asks us to count visits for all possible subdomains of each domain in the input. At first glance, we might think to iterate through each domain and manually tally up the visits for each possible subdomain. However, if we do this naively for every domain and subdomain, we may end up repeating a lot of work.

Instead, we can use a hash map (dictionary) to keep a running total of visits for each subdomain as we process the input. For each entry, we can split the domain into its possible subdomains and increment the count for each one. This approach avoids redundant work and is efficient.

The key insight is that each domain string can be split at each dot to generate all its subdomains, and we can process each in a single pass.

Solution Approach

Here is a step-by-step breakdown of the solution:

Initialize a hash map: Use a dictionary (or map) to store subdomain visit counts. The keys will be subdomain strings, and the values will be their respective visit totals.
Process each input string:
- Split each string into the visit count and the domain name.
- Convert the count to an integer.
Generate all subdomains:
- Split the domain by the dot character ('.').
- For each possible suffix (from each position to the end), join the parts to form a subdomain.
- For example, discuss.leetcode.com yields:
  - discuss.leetcode.com
  - leetcode.com
  - com
Update counts: For each subdomain, add the visit count to its entry in the hash map.
Format the output: Convert the hash map entries into the required output format: "count subdomain".

This approach ensures that each subdomain's count is accumulated correctly and efficiently.

Example Walkthrough

Let's walk through the example cpdomains = ["9001 discuss.leetcode.com"].

Split the string into count = 9001 and domain = "discuss.leetcode.com".
Generate subdomains:
- Start with the full domain: discuss.leetcode.com
- Remove the first label: leetcode.com
- Remove the next label: com
Add 9001 to each subdomain in the hash map:
- discuss.leetcode.com: 9001
- leetcode.com: 9001
- com: 9001
Final output: ["9001 discuss.leetcode.com", "9001 leetcode.com", "9001 com"]

If there were multiple entries, counts for repeated subdomains would be summed.

Code Implementation

from typing import List
class Solution:
    def subdomainVisits(self, cpdomains: List[str]) -> List[str]:
        counts = {}
        for entry in cpdomains:
            count_str, domain = entry.split()
            count = int(count_str)
            fragments = domain.split('.')
            for i in range(len(fragments)):
                subdomain = '.'.join(fragments[i:])
                counts[subdomain] = counts.get(subdomain, 0) + count
        return [f"{cnt} {dom}" for dom, cnt in counts.items()]

#include <vector>
#include <string>
#include <unordered_map>
using namespace std;

class Solution {
public:
    vector<string> subdomainVisits(vector<string>& cpdomains) {
        unordered_map<string, int> counts;
        for (const string& entry : cpdomains) {
            int space = entry.find(' ');
            int count = stoi(entry.substr(0, space));
            string domain = entry.substr(space + 1);
            for (size_t i = 0; i <= domain.size(); ++i) {
                if (i == 0 || domain[i - 1] == '.') {
                    string subdomain = domain.substr(i);
                    counts[subdomain] += count;
                }
            }
        }
        vector<string> result;
        for (const auto& kv : counts) {
            result.push_back(to_string(kv.second) + " " + kv.first);
        }
        return result;
    }
};

import java.util.*;

class Solution {
    public List<String> subdomainVisits(String[] cpdomains) {
        Map<String, Integer> counts = new HashMap<>();
        for (String entry : cpdomains) {
            String[] parts = entry.split(" ");
            int count = Integer.parseInt(parts[0]);
            String domain = parts[1];
            String[] fragments = domain.split("\\.");
            for (int i = 0; i < fragments.length; i++) {
                String subdomain = String.join(".", Arrays.copyOfRange(fragments, i, fragments.length));
                counts.put(subdomain, counts.getOrDefault(subdomain, 0) + count);
            }
        }
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Integer> kv : counts.entrySet()) {
            result.add(kv.getValue() + " " + kv.getKey());
        }
        return result;
    }
}

/**
 * @param {string[]} cpdomains
 * @return {string[]}
 */
var subdomainVisits = function(cpdomains) {
    const counts = {};
    for (let entry of cpdomains) {
        let [countStr, domain] = entry.split(' ');
        let count = parseInt(countStr);
        let fragments = domain.split('.');
        for (let i = 0; i < fragments.length; i++) {
            let subdomain = fragments.slice(i).join('.');
            counts[subdomain] = (counts[subdomain] || 0) + count;
        }
    }
    return Object.entries(counts).map(([dom, cnt]) => `${cnt} ${dom}`);
};

Time and Space Complexity

Brute-force approach: If we were to generate all possible subdomains for all possible substrings, it would be unnecessarily repetitive and inefficient, potentially O(N * L^2), where N is the number of domains and L is the average length.

Optimized approach (hash map):

Time Complexity: O(N * L), where N is the number of entries in cpdomains and L is the average number of labels (parts separated by '.') in a domain. For each domain, we generate at most L subdomains.
Space Complexity: O(M), where M is the total number of unique subdomains encountered. This space is used by the hash map.

This is efficient because each subdomain is processed only once per appearance.

Summary

This problem demonstrates how to efficiently aggregate counts for hierarchical data using a hash map. By splitting each domain into all possible subdomains and updating a running total, we can solve the problem in a single pass. The key insight is to use the structure of the domain names to generate subdomains systematically, and a hash map to accumulate results without redundant work. This approach is both simple and efficient, making it suitable for large inputs.

AlgoMap.io

Want Help Cracking FAANG?