1722. Minimize Hamming Distance After Swap Operations - Leetcode Solution

Problem Description

You are given three things:

An integer array source
An integer array target of the same length as source
An array allowedSwaps where each element is a pair of indices [i, j] indicating you are allowed to swap source[i] with source[j] any number of times

Your goal is to perform any number of swaps (using only the allowed pairs) to rearrange source such that the Hamming distance between source and target is minimized. The Hamming distance is the number of positions i where source[i] != target[i].

Constraints:

Each swap can be used any number of times
Each index can be swapped only with those it's allowed to (as per allowedSwaps)
There may be cycles or disconnected groups in allowedSwaps
Elements can be reused within their connected component, but not across components

The task is to return the minimum possible Hamming distance after performing any number of allowed swaps.

Thought Process

At first glance, one might consider trying all permutations of source using allowed swaps, but this quickly becomes infeasible for large arrays due to the factorial number of possibilities.

However, a key insight is that swaps are only allowed between certain indices, forming groups of indices that can be freely permuted among themselves. If two indices are connected (directly or indirectly) via allowed swaps, their values can be rearranged in any order within that group.

Therefore, the problem can be broken down into:

Finding connected components of indices (using allowed swaps)
Within each component, rearranging the source values to best match the target values at those same indices

This reduces the problem from a global permutation to a set of smaller, independent matching problems.

Solution Approach

Let's break down the solution step by step:

Find Connected Components:
- Use Union-Find (Disjoint Set Union) to group indices that can reach each other via allowed swaps.
- After processing all allowedSwaps, each group represents a set of indices whose values can be rearranged freely among themselves.
Group Indices by Component:
- For each component, collect the indices that belong to it.
Count Values in Each Component:
- For each component, count the occurrences of each value in source and target at those indices (using hash maps or arrays).
Calculate Minimum Hamming Distance for Each Component:
- For each value in the component, the maximum number of matches is the minimum of its count in source and in target.
- The number of mismatches is the total number of indices in the component minus the number of matches.
Sum Up Across All Components:
- Add up the mismatches from all components to get the final answer.

Why Union-Find? Union-Find is efficient for grouping items into connected components and supports fast merging and finding of groups.
Why Hash Maps? Hash maps allow us to count occurrences of each value efficiently, which is needed to compute the best possible matchings.

Example Walkthrough

Sample Input:
source = [1,2,3,4]
target = [2,1,4,5]
allowedSwaps = [[0,1],[2,3]]

Step 1: Find Connected Components
- [0,1] means indices 0 and 1 are connected.
- [2,3] means indices 2 and 3 are connected.
- So, there are two components: {0,1} and {2,3}.

Step 2: For Each Component, Compare Source and Target
- For {0,1}:

source = [1,2]
target = [2,1]
We can rearrange source to match target exactly (swap 1 and 2), so 0 mismatches here.

- For {2,3}:

source = [3,4]
target = [4,5]
We can match 4 with 4, but 3 does not have a match for 5. So, 1 mismatch here.

Final Answer: 0 (from first component) + 1 (from second component) = 1

Why? Because within each connected group, we can rearrange the values to minimize mismatches, but cannot swap values between groups.

Time and Space Complexity

Brute-force Approach:
- Try all permutations of source within allowed swaps: This is factorial time in the size of each component, which is infeasible.

Optimized Approach:
- Union-Find: Each union and find operation is almost O(1) (amortized), so for N elements and M swaps: O(N + M).
- Grouping and Counting: For each index, we process it a constant number of times: O(N).
- Total: O(N + M), where N is length of source and M is number of allowed swaps.
- Space Complexity: O(N) for Union-Find, O(N) for hash maps.

This is efficient and scalable for large arrays.

Summary

The key insight is to decompose the array into connected groups based on allowed swaps, then optimally match values within each group to minimize mismatches. By using Union-Find to identify these groups and hash maps to count and match values, we reduce a potentially exponential problem to linear time. This approach is elegant because it leverages the structure of the problem and avoids unnecessary computation.

Code Implementation

from collections import defaultdict, Counter

class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))
    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x
    def union(self, x, y):
        self.parent[self.find(x)] = self.find(y)

class Solution:
    def minimumHammingDistance(self, source, target, allowedSwaps):
        n = len(source)
        uf = UnionFind(n)
        for i, j in allowedSwaps:
            uf.union(i, j)
        groups = defaultdict(list)
        for i in range(n):
            groups[uf.find(i)].append(i)
        res = 0
        for indices in groups.values():
            src_counter = Counter([source[i] for i in indices])
            tgt_counter = Counter([target[i] for i in indices])
            # Number of matches is sum of min counts for each value
            matches = 0
            for val in src_counter:
                matches += min(src_counter[val], tgt_counter.get(val, 0))
            res += len(indices) - matches
        return res

#include <vector>
#include <unordered_map>
#include <unordered_set>
#include <algorithm>
using namespace std;

class UnionFind {
public:
    vector<int> parent;
    UnionFind(int n) : parent(n) {
        for (int i = 0; i < n; ++i) parent[i] = i;
    }
    int find(int x) {
        if (parent[x] != x) parent[x] = find(parent[x]);
        return parent[x];
    }
    void unite(int x, int y) {
        parent[find(x)] = find(y);
    }
};

class Solution {
public:
    int minimumHammingDistance(vector<int>& source, vector<int>& target, vector<vector<int>>& allowedSwaps) {
        int n = source.size();
        UnionFind uf(n);
        for (auto& swap : allowedSwaps) {
            uf.unite(swap[0], swap[1]);
        }
        unordered_map<int, vector<int>> groups;
        for (int i = 0; i < n; ++i) {
            groups[uf.find(i)].push_back(i);
        }
        int res = 0;
        for (auto& [root, indices] : groups) {
            unordered_map<int, int> src_count, tgt_count;
            for (int idx : indices) {
                src_count[source[idx]]++;
                tgt_count[target[idx]]++;
            }
            int matches = 0;
            for (auto& [val, cnt] : src_count) {
                matches += min(cnt, tgt_count[val]);
            }
            res += indices.size() - matches;
        }
        return res;
    }
};

import java.util.*;

class UnionFind {
    int[] parent;
    public UnionFind(int n) {
        parent = new int[n];
        for (int i = 0; i < n; ++i) parent[i] = i;
    }
    public int find(int x) {
        if (parent[x] != x) parent[x] = find(parent[x]);
        return parent[x];
    }
    public void union(int x, int y) {
        parent[find(x)] = find(y);
    }
}

class Solution {
    public int minimumHammingDistance(int[] source, int[] target, int[][] allowedSwaps) {
        int n = source.length;
        UnionFind uf = new UnionFind(n);
        for (int[] swap : allowedSwaps) {
            uf.union(swap[0], swap[1]);
        }
        Map<Integer, List<Integer>> groups = new HashMap<>();
        for (int i = 0; i < n; ++i) {
            int root = uf.find(i);
            groups.computeIfAbsent(root, k -> new ArrayList<>()).add(i);
        }
        int res = 0;
        for (List<Integer> indices : groups.values()) {
            Map<Integer, Integer> srcCount = new HashMap<>();
            Map<Integer, Integer> tgtCount = new HashMap<>();
            for (int idx : indices) {
                srcCount.put(source[idx], srcCount.getOrDefault(source[idx], 0) + 1);
                tgtCount.put(target[idx], tgtCount.getOrDefault(target[idx], 0) + 1);
            }
            int matches = 0;
            for (int val : srcCount.keySet()) {
                matches += Math.min(srcCount.get(val), tgtCount.getOrDefault(val, 0));
            }
            res += indices.size() - matches;
        }
        return res;
    }
}

class UnionFind {
    constructor(n) {
        this.parent = Array(n).fill(0).map((_, i) => i);
    }
    find(x) {
        if (this.parent[x] !== x) this.parent[x] = this.find(this.parent[x]);
        return this.parent[x];
    }
    union(x, y) {
        this.parent[this.find(x)] = this.find(y);
    }
}

var minimumHammingDistance = function(source, target, allowedSwaps) {
    const n = source.length;
    const uf = new UnionFind(n);
    for (const [i, j] of allowedSwaps) {
        uf.union(i, j);
    }
    const groups = {};
    for (let i = 0; i < n; ++i) {
        const root = uf.find(i);
        if (!(root in groups)) groups[root] = [];
        groups[root].push(i);
    }
    let res = 0;
    for (const indices of Object.values(groups)) {
        const srcCount = new Map();
        const tgtCount = new Map();
        for (const idx of indices) {
            srcCount.set(source[idx], (srcCount.get(source[idx]) || 0) + 1);
            tgtCount.set(target[idx], (tgtCount.get(target[idx]) || 0) + 1);
        }
        let matches = 0;
        for (const [val, cnt] of srcCount.entries()) {
            matches += Math.min(cnt, tgtCount.get(val) || 0);
        }
        res += indices.length - matches;
    }
    return res;
};

AlgoMap.io

Want Help Cracking FAANG?