Find the duplicated number in an Array.
Find the duplicate numbers in an Array
Today I was asked an interesting question during an interview. The problem is that find the duplicated number in an array where the array consists of numbers from 1 to n plus a duplicated number.
One example, given [1,2,3,2], the algorithm should return 2.
Solution 1 Hashmap:
It is pretty easy just to use a hashmap to store the frequency for each number and return the number with frequency as 2.
With python it is just one line:
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
a = [1,2,3,2] | |
c = Counter(a) | |
print([k for k,v in c.items() if v == 2][0]) |
The time complexity and space complexity are both O(n).
Solution 2 Sorting:
If we want to optimize the space complexity, we can sort the array first, and iterate the array and return the element which is the same as its predecessor. The code looks like this:
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
def findDuplicate(array): | |
array.sort() | |
for i, n in enumerate(array): | |
if i > 0 and array[i] == array[i - 1]: | |
return array[i] | |
return None |
The time complexity is O(N log N) and space complexity is O(1)
Solution 3 Cycle sort:
Cycle sort is a very efficient algorithm if the element of the input array is known before. The idea is to put number i into the position ith + 1 in the array, so we can sort the array with O(n) time complexity.
To make it easy to understand, we can allocate a new array with the same size as the input array, and move each element of the input array to its correct position in the new array. For example, 1 should go to index 0, 2 should go to index 1, etc..
The code looks like this:
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
def findDuplicated(array): | |
newArray = [0] * len(array): | |
for i in array: | |
if newArray[i - 1] != 0: | |
return i | |
else: | |
newArray[i - 1] = i | |
return None |
Based on the idea, we can further improve the space complexity by doing in-place cycle sort, if we find a number is already in place, we do nothing,
otherwise, we put them to the correct place by swapping. If the desired place already has a correct number, this means we find a duplicated number.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
public int findDuplicated(int[] array) { | |
for (int i = 0; i < array.length;) { | |
if (array[i] == i + 1) { // i is already in position | |
i++; | |
continue; | |
} else if (array[i] == array[array[i] - 1]) { // find duplicated number | |
return array[i]; | |
} else { // put i to the ith-1 position | |
swap(array, i, array[i] - 1); | |
} | |
} | |
return array[array.length - 1]; | |
} | |
private void swap(int[] array, int i, int j) { | |
int t = array[i]; | |
array[i] = array[j]; | |
array[j] = t; | |
} |
The final solution gives us O(n) time complexity and O(1) space complexity. If you know a better solution, feel free to leave a comment.
Labels: Algorithm