Count the number of sub-arrays with at least k pairs of indices (i, j) such that i < j and arr[i] == arr[j] in a given array.

Given an integer array arr and an integer k, count the number of good sub-arrays of arr. A sub-array arr is good if there are at least k pairs of indices [ i, j ] such that i < j and arr[i] == arr[j].

Examples :

Input: arr = [3,1,4,3,2,2,4], k = 2 Output: 4 Explanation: The output is 4 as there are 4 different good sub-arrays present inside the array arr which have at least 2 pairs whose elements are equal.

  • [3,1,4,3,2,2] has 2 such pairs. They are ( 3, 3 ) and ( 2, 2 ) at indices [0,3] and [4,5] respectively.

  • [3,1,4,3,2,2,4] has 3 such pairs . They are ( 3, 3 ), ( 2, 2 ), and ( 4, 4 ) at indices [0,3], 4,5] and [2, 6] respectively.

  • [1,4,3,2,2,4] has 2 such pairs . They are ( 4, 4) and ( 2, 2 ) at indices [2,6] and [4,5] respectively.

  • [4,3,2,2,4] has 2 such pairs. They are ( 4,4 ) and ( 2, 2 ) at indices [2,6] and [4,5] respectively.

Input: arr = [2,2,2], k = 1 Output: 3 Explanation: The output is 3 as there are 3 different sub-arrays present inside the array arr which have at least 1 pair whose elements are equal

  • [2,2]. It is the sub-array present between indices [ 0, 1 ]. It has 1 pair (2,2) such that 0 < 1 and arr[0] == arr[1]

  • [2,2]. It is the sub-array present between indices [ 1, 2 ]. It has 1 pair (2,2) such that 1 < 2 and arr[1] == arr[2]. - -[2,2,2] . It is the sub-array present between indices [ 0, 2 ]. It has 3 pairs (2,2), (2,2), and (2,2) at indices [0,1], [1,2], and [0,2] respectively such that the elements at the indices are equal.

Method 1 ( Brute Force )

A simple approach can be considering every sub-array and counting the number of good pairs which are present in the sub-array

For example, consider array arr = [ 1,1,1 ]. The good pairs present in the array are ( 1,1), (1,1), ( 1, 1) at indices [0,1], [1,2], and [0,2] respectively. Every single value in the array can combine with other equal values present on its right side in the array as we have to satisfy the condition [ i < j ] where i, j are indices of the good pair. If the frequency of an element 'ele' of the array is 'f', then there are f-1 other equal elements present on the right side with which element ele can combine and form a good pair. Similarly, other occurrences of ele can also combine with other equal elements present on its right side. So the total number of good pairs present in the array for the ele will be (f-1) + (f-2) + (f-3) + ...... + 2 + 1 = ( f *(f-1) ) / 2. Therefore, we will consider every subarray and count the number of good pairs present inside it for every element. We will take a hashmap for maintaining the frequency of every element.

Codes:

class Solution {
    public static void main(String[] args)
    {
        int arr[] = { 3, 1, 4, 3, 2, 2, 4 }, k = 2;
        long ans = countGoodPairs(arr, k);
        System.out.println(ans);
        // Output is 4 
    }

    static long countGoodPairs(int[] nums, int k)
    {
        // n is the length of input array arr
        int n = nums.length;
        // ans will hold the final answer
        long ans = 0;
        // i marks the beginning of subarray
        for (int i = 0; i < n; i++) {
            // Initialized a hashmap map for storing the
            // count of every element
            HashMap<Integer, Long> map = new HashMap<>();
            // goodPairs will hold the count of good pairs
            // present inside the given subarray
            long goodPairs = 0;
            // j marks the end of subarray beginning from i
            for (int j = i; j < n; j++) {

                int element = nums[j];
                map.put(element,
                        map.getOrDefault(element, 0l) + 1);
                // newFreq is the frequency of the  element
              //  oldFreq is the previous frequency of the  element
                long newFreq = map.get(element);
                  long oldFreq = newFreq - 1 ; 
                //  newPair represents the total number of good
                //  pairs possible for the element in the
                //  subarray starting from i and ending at j 
                long newPair = (newFreq * (newFreq - 1)) / 2;
                  long oldPair = ( oldFreq * ( oldFreq - 1 ))/ 2 ; 
                // the total count of goodPairs is increased
                // by the value of (  newPair - oldPair) 
                goodPairs = goodPairs + newPair - oldPair ;
                // if the total number of good pairs is
                // greater than k , than ans should be
                // increased
                if (goodPairs >= k) {
                    ans++;
                }
            }
        }
        return ans;
    }
}

Time complexity: O(n^2) for traveling every sub-array.

Auxiliary Space: O(n) for maintaining the hash map.

Method 2 ( Optimized Approach)

One thing to note down is the fact that we have to count the total sub-arrays and a sub-array is always continuous. Also if a good sub-array good_arr has at least k good pairs then all the other sub-arrays which have good_arr present inside them will have at least k good pairs. So we can avoid counting the number of good pairs in such sub-arrays which have the good_arr as their part because such sub-arrays will definitely contribute to the answer.

Assume array arr = [ __ ,__ , __ , 3 ,__ , 3 ,__ , __ ] . ( Here '' represents other elements present inside arr which are not equal to 3.) If the value of k = 1, then [ 3, __, 3] between indices [ 3, 5] can be a good sub-array, good_arr. Also, all the other sub-arrays which have good_arr as their part can surely contribute to the final answer. For example, here in the case of arr, if [ 3,_,3] is a good_arr then [__,3, __, 3], [ __, __,3, __, 3], [ __,__, __,3, __, 3] and many other sub-arrays which have good_arr as their part will also be good sub-array. So we can use the variable length sliding window approach here. Assume the window starts from index i and ends at index j. Initially, i and j are at index 0. Whenever good_arr satisfies the given condition then all the sub-arrays on the left side of good_arr which have good_arr present inside them will also contribute to the answer. We are not considering the sub-array on the right side of good_arr as they remain to be explored. As the window or the jth pointer moves forward such sub-arrays present on the right side will also be included in the answer. We will try to minimize the size of good_arr by following the given condition of minimum total good pairs. The purpose of minimizing the size of good_arr is that the good_arr should combine with the maximum number of other sub-arrays on its left side. Basically, we are also using a greedy approach here. We will try to increase the number of sub-arrays which include good_arr by moving the i pointer forward and thus decreasing the size of good_arr.

Below are the steps:

1) Start iterating the array and maintain two pointer variables i and j. A variable ans stores the final answer and a variable good_pairs stores the count of total good pairs present in the given sub-array window between indices i and j.

2) A hashmap map is also maintained for counting the frequency of elements present in the given window or the sub-array.

3) Initially, i and j are at index 0. The variable i will point to the starting of the sub-array window and j will point to the ending of the sub-array.

4) We will visit the jth element and will increase its frequency in the hashmap. If suppose the increased frequency is A then we will increase good_pairs by A-1 as the jth element can combine with A -1 other equal elements on its left side present in the sub-array.

5) If good_pairs <k, then we will move j forward.

6) If good_pairs >= k, then the sub-array between indices i and j is a good sub-array. Let us call it good_arr.

7) Now we will greedily try to decrease the size of good_arr as much as we can by moving i pointer towards j so that the good_arr can combine with the maximum number of sub-arrays on its left side.

8) For moving the i pointer forward we have to consider the given situation. Assume that the frequency of the ith element is B. This means that there are B-1 equal elements on the right side present in the good_arr with which the element at the ith index can combine. We will try to remove the ith element contribution ( B-1) from good_pairs before moving i pointer forward.

9) So if ( good_pairs - (B-1) >= k ) than we can move ith pointer forward. ( We are moving ith pointer forward because we are using a greedy approach that the good_arr must combine with maximum sub-arrays on its left side). Similarly, we will also keep trying for every new ith element.

10) Now if we are at index i, such that ( good_pairs - (B-1) < k ) this means that we can not remove ith element from our subarray.

11) Now i is pointing to the starting index of minimum possible size good_arr and j is pointing to the ending index. Our ans will be increased by ( i +1 ) as there are i other sub-arrays on the left side that have good_arr present inside them.

12) After increasing ans, we will move the jth pointer forward.

13) We will repeat the steps from 3 to 12 for every new jth element.

Below is the code for the problem. Assume the array arr = { 3, 1, 4, 3, 2, 2, 4} and the value of k = 2

Code:


import java.io.*;
import java.util.*;

class GFG {
    public static void main(String[] args)
    {
        int arr[] = { 3, 1, 4, 3, 2, 2, 4 }, k = 2;
        long res = countGoodPairs(arr, k);
        System.out.println(res);
        // Output is 4 
    }

    static long countGoodPairs(int[] nums, int k)
    {
        // A hahsmap 'map' is created for counting frequency
        // of every element.
        HashMap<Integer, Long> map = new HashMap<>();
        // 'ans' variable will store the final result.
        long ans = 0;
        // 'good_pairs' varibles store the total number of
        // pairs satisfying the condtion in the given window
        long good_pairs = 0;
        int n = nums.length;
        // 'i' is the starting point of window
        int i = 0;
        // j is the ending point of the window
        for (int j = 0; j < n; j++) {
            // update the map for every element.
            map.put(nums[j],
                    map.getOrDefault(nums[j], 0l) + 1);
            // 'a' represents the frequency of the given
            // nums[j] element
            long a = map.get(nums[j]);
            // good_pairs is updated by (a- 1) beacuse the
            // jth element can combine with other( a-1)
            // elemnts on its left side in the subarray
            good_pairs += a - 1;

            // if good_pairs  is more than or equal to k
            // than we can decrease the window size
            if (good_pairs >= (long)k) {
                // decresing the window size
                while (i < j) {

                    //  b is the number of pairs which can
                    //  be combined with nums[i] element
                    long b = map.get(nums[i]) - 1;
                    // if the freq can still be greater
                    // without nums[i] elemnent contribution
                    // than remove nums[i] contribution from
                    // the good_pairs
                    if (good_pairs - b >= (long)k) {
                        // update the map because now
                        // nums[i] is no longer part of the
                        // window
                        map.put(nums[i], b);
                        good_pairs -= b;
                        i++;
                    }
                    else
                        break;
                }
                // ans is updatde by ( i+1) as every other
                // array on left side of i including [i , j]
                // subarray will contribute to the answer.
                ans += (i + 1);
            }
        }
        return ans;
    }
}

Time Complexity: O(n). Every time window or the subarray is decreased to as minimum as possible. Therefore the time complexity will be nearly O(n)

Auxiliary Space: O(n) as hash-map is maintained for maintaining the frequency for each element