# Length of Shortest sub-array containing ‘k’ distinct elements of another Array | Sliding Window Problem

I came across this variant of the Sliding Window Problem during the coding round of a software engineering interview. I couldn’t find any references to it on the internet, hence this article. 😀

Problem Statement:
Given two arrays of integers (array A and array B) and a number (K), find the length of the shortest sub-array of array A which contains exactly K distinct elements of array B.

The problem statement might be a bit difficult to grasp at a single glance (and the original question used fancy words like stream of network data…), but the below details should make it clearer

Input Format:
The first line contains 3 space separated integers N and M and K, in this particular order. The next line contains N integers i.e. the elements of the array A, and the next line contains M integers denoting the elements of the array B.

Input Constraints:
1 ≤ KMN ≤ 2 x 10⁵
1 ≤ A[i], B[i] ≤ 10⁹

Sample Input:
`5 3 2 1 1 1 2 1 1 2 3`

Sample Output:
`2`

Explanation:
In this case K = 2, so the required sub-array of array A should contain 2 distinct elements of array B. The shortest sub-array fulfilling our requirements would be sub array [4…5] i.e. {2, 1}

Solution:
On analyzing the problem statement, it’s evident that this is another instance of the sliding window problem, with the window in this case representing the sub-array of array A.

To read more about sliding window problems, you can refer to this medium article by outco.io

Let’s have a look at the solution and then go through it’s major points

• We’ve created a `set` and added all elements of `arrayB` in the set. We’ll use this set to determine if the number being considered is part of `arrayB` or not
• The `map` will store numbers from `arrayA`, along with the indices at which the numbers occur. Every time we encounter a number again, we’ll put the number and the new index into the `map`. Hence the `map` will always store the index of the right-most occurrence of the number (that we’ve encountered so far)
• We’ll only insert a number in the `map` if it is also part of the `set`. Hence the size of the `map` will give us the number of distinct elements in the sub-array, which are also part of `arrayB`
• The variable `minLen` is initialized with the value `Integer.MAX_VALUE`and holds our interim result
• The variables `start` and `end` represent the starting index and ending index of the window
• With every iteration of the `for` loop (line 10), we increment (slide) the end of the window, increasing the length of the sub-array by 1
• If the size of the map is less than `k`, we’ll continue to the next iteration of the loop
• If the size of the map is greater than `k`, we’ll keep sliding the start of the window to remove elements which do not impact the size of the `map`. We ensure this by using the following conditions — either the number at the start of the window is not part of the `set` or it’s not the first occurrence of that number in our window (line 17)
• Then we’ll remove the number at the `start` of the window and increment the `start` by 1. This will make the size of the `map` equal to `k` (Since at every step since we only slide the `end` of the window by 1, the size of the `map` will never exceed `k` by more than 1)
• Line 20 will repeat the same operation as line 17, sliding the start of the window to the right, without affecting the size of the `map`
• We’ll update the interim result variable `minLen` by comparing it with the current length of the window
• At the end of the for loop, we’ll return the result if it holds a valid value. Else we’ll return `-1` to indicate that no sub-array could be found which satisfies our requirements

Below is a runnable version of the same method, complete with input and output statements

Part time Tech. Enthusiast! Full time Software Developer.

Part time Tech. Enthusiast! Full time Software Developer.