Wednesday, January 20, 2021

Find the duplicated number in an Array.

 Find the duplicate numbers in an Array



Today I was asked an interesting question during an interview. The problem is that find the duplicated number in an array where the array consists of numbers from 1 to n plus a duplicated number.

One example, given [1,2,3,2], the algorithm should return 2.

Solution 1 Hashmap: 

It is pretty easy just to use a hashmap to store the frequency for each number and return the number with frequency as 2.
With python it is just one line:

a = [1,2,3,2]
c = Counter(a)
print([k for k,v in c.items() if v == 2][0])
view raw duplicated 1.py hosted with ❤ by GitHub
The time complexity and space complexity are both O(n).

Solution 2 Sorting:

If we want to optimize the space complexity, we can sort the array first, and iterate the array and return the element which is the same as its predecessor. The code looks like this:

def findDuplicate(array):
array.sort()
for i, n in enumerate(array):
if i > 0 and array[i] == array[i - 1]:
return array[i]
return None
view raw duplicated 3.py hosted with ❤ by GitHub
The time complexity is O(N log N) and space complexity is O(1)

Solution 3 Cycle sort:

Cycle sort is a very efficient algorithm if the element of the input array is known before. The idea is to put number i into the position ith + 1 in the array, so we can sort the array with O(n) time complexity.
To make it easy to understand, we can allocate a new array with the same size as the input array, and move each element of the input array to its correct position in the new array. For example, 1 should go to index 0, 2 should go to index 1, etc..
The code looks like this:
def findDuplicated(array):
newArray = [0] * len(array):
for i in array:
if newArray[i - 1] != 0:
return i
else:
newArray[i - 1] = i
return None
view raw Cycle sort.py hosted with ❤ by GitHub
Based on the idea, we can further improve the space complexity by doing in-place cycle sort, if we find a number is already in place, we do nothing, otherwise, we put them to the correct place by swapping. If the desired place already has a correct number, this means we find a duplicated number.
public int findDuplicated(int[] array) {
for (int i = 0; i < array.length;) {
if (array[i] == i + 1) { // i is already in position
i++;
continue;
} else if (array[i] == array[array[i] - 1]) { // find duplicated number
return array[i];
} else { // put i to the ith-1 position
swap(array, i, array[i] - 1);
}
}
return array[array.length - 1];
}
private void swap(int[] array, int i, int j) {
int t = array[i];
array[i] = array[j];
array[j] = t;
}

The final solution gives us O(n) time complexity and O(1) space complexity. If you know a better solution, feel free to leave a comment.

Labels: