The "Find the Duplicate Number" problem is a classic algorithmic challenge. You are given an array nums
containing n + 1
integers, where each integer is in the range [1, n]
(inclusive). There is only one repeated number in nums
, but it could appear more than once. Your task is to find and return the duplicate number.
Example:
Input: nums = [1,3,4,2,2]
Output: 2
At first glance, the problem seems to invite a brute-force solution: compare every pair of elements and return the duplicate. However, this approach is too slow (O(n2)), especially for large arrays.
Next, you might consider using a hash set to track seen numbers, but the problem requires constant extra space, so this is not allowed.
The key to solving this efficiently is to realize that the array's properties are similar to a linked list with a cycle: since there are n + 1
numbers in the range [1, n]
, at least one number must repeat (by the pigeonhole principle). This insight allows us to use Floyd's Tortoise and Hare (Cycle Detection) algorithm, which finds cycles in linked lists using two pointers and constant space.
Here’s how we can use Floyd’s Cycle Detection algorithm to find the duplicate number:
nums
as a “pointer” to the next index (i.e., at index i
, the next index is nums[i]
).slow
and fast
at the start of the array (e.g., slow = nums[0]
, fast = nums[nums[0]]
).slow
by one step and fast
by two steps until they meet. This guarantees they meet inside the cycle.slow
to the start of the array. Move both slow
and fast
one step at a time. The point where they meet again is the duplicate number.Why This Works: The cycle's entry point corresponds to the duplicate number because that's where two "pointers" (indices) point to the same value.
Let's walk through the solution with the input nums = [3,1,3,4,2]
.
Result: The duplicate number is 3
.
class Solution:
def findDuplicate(self, nums):
# Phase 1: Find the intersection point
slow = nums[0]
fast = nums[0]
while True:
slow = nums[slow]
fast = nums[nums[fast]]
if slow == fast:
break
# Phase 2: Find the entrance to the cycle
slow = nums[0]
while slow != fast:
slow = nums[slow]
fast = nums[fast]
return slow
class Solution {
public:
int findDuplicate(vector<int>& nums) {
int slow = nums[0];
int fast = nums[0];
// Phase 1
do {
slow = nums[slow];
fast = nums[nums[fast]];
} while (slow != fast);
// Phase 2
slow = nums[0];
while (slow != fast) {
slow = nums[slow];
fast = nums[fast];
}
return slow;
}
};
class Solution {
public int findDuplicate(int[] nums) {
int slow = nums[0];
int fast = nums[0];
// Phase 1
do {
slow = nums[slow];
fast = nums[nums[fast]];
} while (slow != fast);
// Phase 2
slow = nums[0];
while (slow != fast) {
slow = nums[slow];
fast = nums[fast];
}
return slow;
}
}
var findDuplicate = function(nums) {
let slow = nums[0];
let fast = nums[0];
// Phase 1
do {
slow = nums[slow];
fast = nums[nums[fast]];
} while (slow !== fast);
// Phase 2
slow = nums[0];
while (slow !== fast) {
slow = nums[slow];
fast = nums[fast];
}
return slow;
};
The "Find the Duplicate Number" problem is elegantly solved using Floyd’s Tortoise and Hare cycle detection algorithm. By treating the array as a linked list where each value points to the next index, we exploit the duplicate to detect a cycle and find its entrance — the duplicate number. This method is efficient, requiring only O(n) time and O(1) space, and demonstrates how classic algorithms can be applied to array problems when direct approaches are not allowed by the constraints.