Length of Shortest sub-array containing ‘k’ distinct elements of another Array | Sliding Window Problem
--
I came across this variant of the Sliding Window Problem during the coding round of a software engineering interview. I couldn’t find any references to it on the internet, hence this article. 😀
Problem Statement:
Given two arrays of integers (array A and array B) and a number (K), find the length of the shortest sub-array of array A which contains exactly K distinct elements of array B.
The problem statement might be a bit difficult to grasp at a single glance (and the original question used fancy words like stream of network data…), but the below details should make it clearer
Input Format:
The first line contains 3 space separated integers N and M and K, in this particular order. The next line contains N integers i.e. the elements of the array A, and the next line contains M integers denoting the elements of the array B.
Input Constraints:
1 ≤ K ≤ M ≤ N ≤ 2 x 10⁵
1 ≤ A[i], B[i] ≤ 10⁹
Sample Input:
5 3 2
1 1 1 2 1
1 2 3
Sample Output:2
Explanation:
In this case K = 2, so the required sub-array of array A should contain 2 distinct elements of array B. The shortest sub-array fulfilling our requirements would be sub array [4…5] i.e. {2, 1}
Solution:
On analyzing the problem statement, it’s evident that this is another instance of the sliding window problem, with the window in this case representing the sub-array of array A.
To read more about sliding window problems, you can refer to this medium article by outco.io
Let’s have a look at the solution and then go through it’s major points
- We’ve created a
set
and added all elements ofarrayB
in the set. We’ll use this set to determine if the number being considered is part ofarrayB
or not - The
map
will store numbers fromarrayA
, along with the indices at which the numbers occur. Every time we encounter a number again, we’ll put the number and the new index into themap
. Hence themap
will always store the index of the right-most occurrence of the number (that we’ve encountered so far) - We’ll only insert a number in the
map
if it is also part of theset
. Hence the size of themap
will give us the number of distinct elements in the sub-array, which are also part ofarrayB
- The variable
minLen
is initialized with the valueInteger.MAX_VALUE
and holds our interim result - The variables
start
andend
represent the starting index and ending index of the window - With every iteration of the
for
loop (line 10), we increment (slide) the end of the window, increasing the length of the sub-array by 1 - If the size of the map is less than
k
, we’ll continue to the next iteration of the loop - If the size of the map is greater than
k
, we’ll keep sliding the start of the window to remove elements which do not impact the size of themap
. We ensure this by using the following conditions — either the number at the start of the window is not part of theset
or it’s not the first occurrence of that number in our window (line 17) - Then we’ll remove the number at the
start
of the window and increment thestart
by 1. This will make the size of themap
equal tok
(Since at every step since we only slide theend
of the window by 1, the size of themap
will never exceedk
by more than 1) - Line 20 will repeat the same operation as line 17, sliding the start of the window to the right, without affecting the size of the
map
- We’ll update the interim result variable
minLen
by comparing it with the current length of the window - At the end of the for loop, we’ll return the result if it holds a valid value. Else we’ll return
-1
to indicate that no sub-array could be found which satisfies our requirements
Below is a runnable version of the same method, complete with input and output statements