737. Sentence Similarity II - Leetcode Solution

Code Implementation

class Solution:
    def areSentencesSimilarTwo(self, words1, words2, pairs):
        if len(words1) != len(words2):
            return False

        from collections import defaultdict

        parent = {}

        def find(x):
            if x != parent.setdefault(x, x):
                parent[x] = find(parent[x])
            return parent[x]

        def union(x, y):
            parent[find(x)] = find(y)

        for a, b in pairs:
            union(a, b)

        for w1, w2 in zip(words1, words2):
            if w1 == w2:
                continue
            if find(w1) != find(w2):
                return False
        return True

class Solution {
public:
    unordered_map<string, string> parent;

    string find(string x) {
        if (parent.find(x) == parent.end()) parent[x] = x;
        if (parent[x] != x) parent[x] = find(parent[x]);
        return parent[x];
    }

    void unite(string x, string y) {
        parent[find(x)] = find(y);
    }

    bool areSentencesSimilarTwo(vector<string>& words1, vector<string>& words2, vector<vector<string>>& pairs) {
        if (words1.size() != words2.size()) return false;
        for (auto& p : pairs) {
            unite(p[0], p[1]);
        }
        for (int i = 0; i < words1.size(); ++i) {
            if (words1[i] == words2[i]) continue;
            if (find(words1[i]) != find(words2[i])) return false;
        }
        return true;
    }
};

class Solution {
    Map<String, String> parent = new HashMap<>();

    private String find(String x) {
        if (!parent.containsKey(x)) parent.put(x, x);
        if (!parent.get(x).equals(x)) parent.put(x, find(parent.get(x)));
        return parent.get(x);
    }

    private void union(String x, String y) {
        parent.put(find(x), find(y));
    }

    public boolean areSentencesSimilarTwo(String[] words1, String[] words2, List<List<String>> pairs) {
        if (words1.length != words2.length) return false;
        for (List<String> p : pairs) {
            union(p.get(0), p.get(1));
        }
        for (int i = 0; i < words1.length; ++i) {
            if (words1[i].equals(words2[i])) continue;
            if (!find(words1[i]).equals(find(words2[i]))) return false;
        }
        return true;
    }
}

var areSentencesSimilarTwo = function(words1, words2, pairs) {
    if (words1.length !== words2.length) return false;
    const parent = {};

    function find(x) {
        if (!(x in parent)) parent[x] = x;
        if (parent[x] !== x) parent[x] = find(parent[x]);
        return parent[x];
    }

    function union(x, y) {
        parent[find(x)] = find(y);
    }

    for (let [a, b] of pairs) {
        union(a, b);
    }

    for (let i = 0; i < words1.length; i++) {
        if (words1[i] === words2[i]) continue;
        if (find(words1[i]) !== find(words2[i])) return false;
    }
    return true;
};

Problem Description

Given two sentences represented as string arrays words1 and words2, determine if they are similar. Two sentences are similar if each pair of corresponding words are either the same, or they are similar according to a given list of word pairs pairs. Similarity is transitive: if "a" is similar to "b", and "b" is similar to "c", then "a" is similar to "c". Also, similarity is symmetric. Each word pair in pairs represents a direct similarity.

Key Constraints:

The two sentences (words1 and words2) must be of the same length to be considered similar.
Each pair in pairs is a two-element list of strings.
There may be multiple levels of similarity (i.e., indirect similarity via chains).
There may be only one valid answer; do not assume all words are similar.

Thought Process

At first glance, it seems we could just check each word pair in words1 and words2 against the list of pairs, but this only works for direct similarity. The challenge is the transitive property: "a" similar to "b" and "b" similar to "c" means "a" is similar to "c". This forms a graph where words are connected if they are similar, and we want to see if two words are in the same connected component.

A brute-force approach would be, for each word pair, to search through all possible paths between the two words in the similarity graph (using BFS or DFS). But this would be inefficient, especially if the graph is large.

To optimize, we can model the problem as a union-find (disjoint set) problem. By grouping all connected words together, we can quickly check if two words are in the same group (i.e., similar, directly or indirectly) in nearly constant time.

Solution Approach

To efficiently determine if two words are similar (directly or transitively), we'll use the Union-Find (Disjoint Set Union, DSU) data structure. Here’s how the approach works step by step:

Build the Union-Find Structure:
- Initialize a parent map to keep track of each word’s group leader (its “parent”).
- For every pair (a, b) in pairs, union them together so they belong to the same group.
Union and Find Operations:
- Find(x): Follows parent pointers until it reaches the root representative of x’s group.
- Union(x, y): Finds the roots of x and y, and merges them by pointing one root to the other.
- Path compression is used in find to flatten the structure for faster future queries.
Check Each Word Pair:
- For each position i, if words1[i] and words2[i] are the same, continue.
- Otherwise, check if they have the same root in the union-find structure. If not, return false.
Edge Case:
- If the lengths of words1 and words2 differ, immediately return false.

This approach is efficient because union-find operations are nearly constant time with path compression.

Example Walkthrough

Example Input:

words1 = ["great", "acting", "skills"]
words2 = ["fine", "drama", "talent"]
pairs = [["great", "good"], ["fine", "good"], ["acting","drama"], ["skills","talent"]]

Step-by-Step:

Build the union-find structure:
- Union "great" and "good" → group: {"great", "good"}
- Union "fine" and "good" → group: {"great", "good", "fine"}
- Union "acting" and "drama" → group: {"acting", "drama"}
- Union "skills" and "talent" → group: {"skills", "talent"}
Check each word pair:
- "great" vs "fine": Both are in the same group (via "good"), so similar.
- "acting" vs "drama": Both are in the same group, so similar.
- "skills" vs "talent": Both are in the same group, so similar.
All pairs pass, so return True.

Time and Space Complexity

Brute-Force Approach:

For each word pair, a BFS/DFS could take O(P) time (where P is the number of pairs), leading to O(N*P) for N word pairs—potentially very slow.
Space: O(P) for the graph.

Optimized (Union-Find) Approach:

Union-Find operations with path compression are nearly O(1) per operation (amortized), so total time is O(P + N), where P is the number of pairs and N is the number of words.
Space: O(U), where U is the number of unique words in pairs and sentences (for the parent map).

Summary

The key insight is to treat word similarities as a graph and use the union-find data structure to efficiently group words that are transitively similar. This allows us to check, in nearly constant time, whether two words are in the same similarity group. The approach is elegant because it converts a potentially expensive search problem into a series of quick lookups, making it suitable even for large inputs.

AlgoMap.io

Want Help Cracking FAANG?