Want Help Cracking FAANG?

(Then click this)

×
Back to Question Bank

1519. Number of Nodes in the Sub-Tree With the Same Label - Leetcode Solution

Problem Description

The "Number of Nodes in the Sub-Tree With the Same Label" problem asks you to process a tree (an undirected, connected acyclic graph) with n nodes, where each node is labeled with a lowercase English letter. The tree is described by a list of edges and a string labels of length n, where labels[i] is the label of the i-th node.

For each node, you must compute how many nodes in its sub-tree (including itself) have the same label as the node. The result should be an array ans of length n, where ans[i] is the number of nodes in the sub-tree rooted at node i that have the same label as node i.

Constraints:

  • The tree is rooted at node 0.
  • There is exactly one valid tree (no cycles, all nodes connected).
  • Input sizes: 1 <= n <= 10^5.
  • Input: edges (list of pairs), labels (string).

Thought Process

At first glance, the problem seems to require, for each node, traversing its entire sub-tree and counting matching labels. A brute-force approach would be to, for each node, perform a traversal (like BFS or DFS) and count the labels, resulting in a very high time complexity (O(n^2)).

However, since the graph is a tree and all nodes are connected, we can optimize by using a post-order Depth-First Search (DFS). By processing children before their parent, we can aggregate label counts from the bottom up, avoiding redundant traversals.

The key insight is to compute, for each node, the count of each label in its sub-tree, and store these counts as we traverse up the tree. This way, each node's answer can be computed in constant time after processing its children.

Solution Approach

We solve the problem efficiently using a recursive DFS with post-order traversal. Here’s a step-by-step approach:

  1. Build the Tree: Use an adjacency list to represent the tree from the edges input. This allows fast lookup of children for each node.
  2. Define DFS: Write a recursive DFS function that, for each node, returns a count of each label in its sub-tree.
  3. Aggregate Label Counts: For each child of the current node, recursively call DFS and merge the returned label counts into the current node’s counter.
  4. Update Current Node: After processing all children, increment the count for the current node’s label. Store the count of the node’s label in the answer array for this node.
  5. Prevent Revisiting Parent: To avoid cycles (even though the input is a tree), always pass the parent node in the DFS call and skip it when traversing children.
  6. Return the Answer: After DFS completes, return the answer array.

We use a counter (array of size 26 for each lowercase letter) to keep track of label frequencies efficiently. This approach ensures that each node and edge is visited only once.

Example Walkthrough

Input:

  • n = 7
  • edges = [[0,1],[0,2],[1,4],[1,5],[2,3],[2,6]]
  • labels = "abaedcd"

Step-by-step:

  1. Build the adjacency list:
    • 0: [1,2]
    • 1: [0,4,5]
    • 2: [0,3,6]
    • 3: [2]
    • 4: [1]
    • 5: [1]
    • 6: [2]
  2. Start DFS at node 0 (label 'a'):
    • Recurse into child 1 (label 'b'):
      • Child 4 (label 'e') is a leaf: returns [0,...,1,...,0]
      • Child 5 (label 'd') is a leaf: returns [0,...,1,...,0]
      • Aggregate counts, increment for 'b', update answer for node 1.
    • Recurse into child 2 (label 'a'):
      • Child 3 (label 'e') is a leaf: returns [0,...,1,...,0]
      • Child 6 (label 'd') is a leaf: returns [0,...,1,...,0]
      • Aggregate counts, increment for 'a', update answer for node 2.
    • Aggregate counts at node 0, increment for 'a', update answer for node 0.
  3. Final answer: [2,1,1,1,1,1,1]

Each value represents the number of nodes in the sub-tree rooted at that node with the same label.

Time and Space Complexity

Brute-force Approach:

  • For each node, traverse its sub-tree (could be O(n) per node).
  • Total: O(n^2) time, which is infeasible for large n.
Optimized DFS Approach:
  • Each node and edge is visited once: O(n) time.
  • Each DFS call merges counters of size 26 (constant): O(26n) = O(n) time.
  • Space: O(n) for the tree and answer arrays, plus O(h) call stack (h = tree height).

The optimized approach is efficient and suitable for large inputs.

Summary

The problem is elegantly solved by recognizing the value of post-order DFS for aggregating information in trees. By passing up label counts from the leaves to the root, we avoid redundant traversals and achieve linear time complexity. The use of a fixed-size array for label counts and careful tree traversal ensures both speed and clarity, making this a classic example of tree dynamic programming and aggregation.

Code Implementation

from collections import defaultdict

class Solution:
    def countSubTrees(self, n, edges, labels):
        tree = defaultdict(list)
        for u, v in edges:
            tree[u].append(v)
            tree[v].append(u)
        ans = [0] * n

        def dfs(node, parent):
            count = [0] * 26
            label_idx = ord(labels[node]) - ord('a')
            count[label_idx] = 1
            for child in tree[node]:
                if child == parent:
                    continue
                child_count = dfs(child, node)
                for i in range(26):
                    count[i] += child_count[i]
            ans[node] = count[label_idx]
            return count

        dfs(0, -1)
        return ans
      
#include <vector>
#include <string>
using namespace std;

class Solution {
public:
    vector<int> countSubTrees(int n, vector<vector<int>>& edges, string labels) {
        vector<vector<int>> tree(n);
        for (auto& e : edges) {
            tree[e[0]].push_back(e[1]);
            tree[e[1]].push_back(e[0]);
        }
        vector<int> ans(n);

        function<vector<int>(int, int)> dfs = [&](int node, int parent) {
            vector<int> count(26, 0);
            int idx = labels[node] - 'a';
            count[idx] = 1;
            for (int child : tree[node]) {
                if (child == parent) continue;
                vector<int> child_count = dfs(child, node);
                for (int i = 0; i < 26; ++i)
                    count[i] += child_count[i];
            }
            ans[node] = count[idx];
            return count;
        };
        dfs(0, -1);
        return ans;
    }
};
      
import java.util.*;

class Solution {
    public int[] countSubTrees(int n, int[][] edges, String labels) {
        List<List<Integer>> tree = new ArrayList<>();
        for (int i = 0; i < n; ++i) tree.add(new ArrayList<>());
        for (int[] e : edges) {
            tree.get(e[0]).add(e[1]);
            tree.get(e[1]).add(e[0]);
        }
        int[] ans = new int[n];

        dfs(0, -1, tree, labels, ans);
        return ans;
    }

    private int[] dfs(int node, int parent, List<List<Integer>> tree, String labels, int[] ans) {
        int[] count = new int[26];
        int idx = labels.charAt(node) - 'a';
        count[idx] = 1;
        for (int child : tree.get(node)) {
            if (child == parent) continue;
            int[] childCount = dfs(child, node, tree, labels, ans);
            for (int i = 0; i < 26; ++i)
                count[i] += childCount[i];
        }
        ans[node] = count[idx];
        return count;
    }
}
      
var countSubTrees = function(n, edges, labels) {
    const tree = Array.from({length: n}, () => []);
    for (const [u, v] of edges) {
        tree[u].push(v);
        tree[v].push(u);
    }
    const ans = new Array(n).fill(0);

    function dfs(node, parent) {
        const count = new Array(26).fill(0);
        const idx = labels.charCodeAt(node) - 97;
        count[idx] = 1;
        for (const child of tree[node]) {
            if (child === parent) continue;
            const childCount = dfs(child, node);
            for (let i = 0; i < 26; ++i)
                count[i] += childCount[i];
        }
        ans[node] = count[idx];
        return count;
    }

    dfs(0, -1);
    return ans;
};