Array range queries for searching an element

Hello Everyone,

Given an array of N elements and Q queries of the form L R X, output for each query whether the element X exists in the array between the indices L and R (both included).

Examples :

Input : N = 5
arr = [1, 1, 5, 4, 5]
Q = 3
1 3 2
2 5 1
3 5 5
Output : No
Yes
Yes
Explanation :
For the first query, 2 does not exist between the indices 1 and 3.
For the second query, 1 exists between the indices 2 and 5.
For the third query, 5 exists between the indices 3 and 5.
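For comparison with the efficient approach, a direct baseline simply scans each queried range. The following is a minimal sketch, assuming 0-based inclusive indices (the example above is 1-based); the helper name existsInRange is hypothetical and not part of the original article.

```cpp
#include <cassert>
#include <vector>

// Hypothetical baseline helper: returns true if x occurs in arr[l..r],
// where l and r are 0-based and both included.
bool existsInRange(const std::vector<int>& arr, int l, int r, int x) {
    // Assumption: the caller passes a valid, non-empty range.
    assert(0 <= l && l <= r && r < static_cast<int>(arr.size()));
    for (int i = l; i <= r; ++i) {
        if (arr[i] == x) return true;  // found X inside the range
    }
    return false;  // scanned the whole range without finding X
}
```

Each call costs time proportional to the length of the range, so Q queries on an array of N elements can cost up to O(N * Q) in total.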

Efficient Approach (Using Mo’s Algorithm) :
Mo’s algorithm is one of the finest applications of square root decomposition.
It is based on the idea of using the answer to the previous query to compute the answer to the current query. This works whenever, given F([L, R]), each of F([L + 1, R]), F([L - 1, R]), F([L, R + 1]) and F([L, R - 1]) can be computed easily, each in O(F) time, where O(F) denotes the cost of adding or removing a single element at one end of the range.
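For this problem, F is a table of element frequencies, and moving one end of the range by one position means adding or removing a single element. A minimal sketch of these two operations is shown here; the type name Window and its members are hypothetical, and std::unordered_map is an assumption made for the sketch (the article's own code uses std::map).

```cpp
#include <cassert>
#include <unordered_map>

// Hypothetical helper type: frequency table of the current range.
struct Window {
    std::unordered_map<int, int> freq;

    // Called when the range grows by one element.
    void add(int value) { ++freq[value]; }

    // Called when the range shrinks by one element.
    // Assumption: the value is currently inside the range.
    void remove(int value) {
        auto it = freq.find(value);
        assert(it != freq.end() && it->second > 0);
        if (--it->second == 0) freq.erase(it);  // keep only present values
    }

    // True if the value occurs at least once in the current range.
    bool contains(int value) const { return freq.count(value) > 0; }
};
```

Unlike the article's listing, this sketch requires that only elements currently in the range are removed, so counts never become negative.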

If the queries are answered in the order in which they are asked, the two ends of the range may jump across the whole array for every query, so the time complexity does not improve over the direct approach. To reduce the time complexity considerably, the queries are divided into blocks and then sorted. The exact algorithm to sort the queries is as follows :

  • Denote BLOCK_SIZE = sqrt(N), rounded down to an integer
  • All the queries with the same L / BLOCK_SIZE (integer division) are put in the same block
  • Within a block, the queries are sorted in increasing order of their R values
  • The sort function thus compares two queries, Q1 and Q2, as follows:
    Q1 must come before Q2 if either of these holds:
    1. L1 / BLOCK_SIZE < L2 / BLOCK_SIZE
    2. L1 / BLOCK_SIZE = L2 / BLOCK_SIZE and R1 < R2
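The ordering rules above can be sketched as a comparator passed to std::sort. This is an illustrative sketch only: the names RangeQuery and sortForMo are hypothetical, indices are assumed to be 0-based, and the block size is clamped to at least 1 as a defensive assumption for very small arrays.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical query type: does x occur in [l, r] (0-based, inclusive)?
struct RangeQuery { int l, r, x; };

// Sorts queries by block of l first, then by r within the same block.
void sortForMo(std::vector<RangeQuery>& queries, int n) {
    assert(n >= 0);
    const int blockSize = std::max(1, static_cast<int>(std::sqrt(n)));
    std::sort(queries.begin(), queries.end(),
              [blockSize](const RangeQuery& a, const RangeQuery& b) {
                  const int blockA = a.l / blockSize;
                  const int blockB = b.l / blockSize;
                  if (blockA != blockB) return blockA < blockB;  // rule 1
                  return a.r < b.r;                              // rule 2
              });
}
```

For N = 5 the block size is 2, so the 0-based queries {2, 4, 5}, {1, 4, 1} and {0, 2, 2} are reordered to {0, 2, 2}, {1, 4, 1}, {2, 4, 5}.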

After sorting the queries, compute the answer to the first query and then answer the remaining queries one after another, each time adjusting the previous range. To determine whether a particular element exists, check its frequency in the current range: a non-zero frequency confirms that the element is present.
To store the frequencies of the elements, an STL map is used in the code below.
In the example given (with indices converted to 0-based, as in the code), the first query after sorting is {0, 2, 2}. Store the frequencies of the elements in [0, 2] in the map and then look up the frequency of the element 2. Since 2 occurs 0 times, print “No”.
To process the next query, which is {1, 4, 1} in this case, decrement the frequencies of the elements in the range [0, 1) and increment the frequencies of the elements in the range [3, 4]. This gives the frequencies of the elements in [1, 4], and the map shows that 1 exists in this range.
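The transition between two consecutive queries can be sketched as a small function that moves both ends of the range. This is a minimal sketch under stated assumptions: the name moveWindow is hypothetical, the current range is kept half-open as [curL, curR), and the range is extended before it is shrunk so that no count becomes negative along the way.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Hypothetical helper: moves the half-open window [curL, curR) so that it
// covers exactly a[l..r] (0-based, inclusive), updating freq on the way.
void moveWindow(const std::vector<int>& a, std::map<int, int>& freq,
                int& curL, int& curR, int l, int r) {
    assert(0 <= l && l <= r && r < static_cast<int>(a.size()));
    while (curL > l) ++freq[a[--curL]];      // extend to the left
    while (curR <= r) ++freq[a[curR++]];     // extend to the right
    while (curL < l) --freq[a[curL++]];      // shrink from the left
    while (curR > r + 1) --freq[a[--curR]];  // shrink from the right
}
```

Starting from an empty window, calling it for [0, 2] and then for [1, 4] on the example array reproduces the two steps described above.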

Time complexity :
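The pre-processing part, that is, sorting the queries, takes O(m log m) time.
The index variable for R changes at most O(n * \sqrt{n}) times throughout the run, because R only moves forward within a block and there are about \sqrt{n} blocks. The index variable for L changes its value at most O(m * \sqrt{n}) times, because L moves at most O(\sqrt{n}) positions per query while it stays inside one block. Hence, processing all queries takes O(n * \sqrt{n}) + O(m * \sqrt{n}) = O((m+n) * \sqrt{n}) index moves. Each move updates the frequency map, which costs O(log n) with an STL map, so the total running time of the code below is O((m+n) * \sqrt{n} * log n).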
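To get a feel for these bounds, here is a rough worked estimate under assumed sizes n = m = 10^5 (illustrative values, not taken from the original), counting index moves only and ignoring the cost of each frequency update:

```latex
\[
\sqrt{n} = \sqrt{10^{5}} \approx 316, \qquad
(m + n)\sqrt{n} \approx (2 \times 10^{5}) \times 316 \approx 6.3 \times 10^{7},
\]
\[
\text{whereas scanning every queried range directly may cost up to }
n \cdot m = 10^{10} \text{ element visits.}
\]
```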

Below is the C++ implementation of the above idea :

// C++ code to determine if the element
// exists for different range queries
#include <bits/stdc++.h>
using namespace std;

// Variable to represent block size.
// This is made global, so compare()
// of sort can use it.
int block;

// Structure to represent a query: does X occur in [L, R]?
struct Query
{
    int L, R, X;
};

// Function used to sort all queries so
// that all queries of same block are
// arranged together and within a block,
// queries are sorted in increasing order
// of R values.
bool compare(Query x, Query y)
{
    // Different blocks, sort by block.
    if (x.L / block != y.L / block)
        return x.L / block < y.L / block;

    // Same block, sort by R value
    return x.R < y.R;
}

// Determines if the element is present for all
// query ranges. m is number of queries,
// n is size of array a[]. Indices are 0-based.
void queryResults(int a[], int n, Query q[], int m)
{
    // Find block size
    block = (int)sqrt(n);

    // Sort all queries so that queries of same
    // blocks are arranged together.
    sort(q, q + m, compare);

    // Initialize current L, current R. The current
    // range is [currL, currR), which is empty at first.
    int currL = 0, currR = 0;

    // To store the frequencies of
    // elements of the current range
    map<int, int> mp;

    // Traverse through all queries
    for (int i = 0; i < m; i++) {
        // L, R and X values of current query
        int L = q[i].L, R = q[i].R, X = q[i].X;

        // Decrement frequencies of extra elements
        // of previous range. For example if previous
        // range is [0, 3] and current range is [2, 5],
        // then the frequencies of a[0] and a[1] are decremented.
        // A count may be negative for a moment here; the
        // loops below bring it back to its correct value.
        while (currL < L)
        {
            mp[a[currL]]--;
            currL++;
        }

        // Increment frequencies of elements that the
        // current range adds on the left side
        while (currL > L)
        {
            mp[a[currL - 1]]++;
            currL--;
        }

        // Increment frequencies of elements that the
        // current range adds on the right side
        while (currR <= R)
        {
            mp[a[currR]]++;
            currR++;
        }

        // Decrement frequencies of elements of previous
        // range. For example when previous range is [0, 10]
        // and current range is [3, 8], then frequencies of
        // a[9] and a[10] are decremented
        while (currR > R + 1)
        {
            mp[a[currR - 1]]--;
            currR--;
        }

        // Print if X exists or not
        if (mp[X] != 0)
            cout << X << " exists between [" << L
                 << ", " << R << "] " << endl;
        else
            cout << X << " does not exist between ["
                 << L << ", " << R << "] " << endl;
    }
}

// Driver program
int main()
{
    int a[] = { 1, 1, 5, 4, 5 };
    int n = sizeof(a) / sizeof(a[0]);
    Query q[] = { { 0, 2, 2 }, { 1, 4, 1 }, { 2, 4, 5 } };
    int m = sizeof(q) / sizeof(q[0]);
    queryResults(a, n, q, m);
    return 0;
}

Output:

2 does not exist between [0, 2]
1 exists between [1, 4]
5 exists between [2, 4]
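Note that the answers are printed in sorted order, not in the order in which the queries were given. One common way to restore the input order is to store each query's original position and write the answer into that slot. The following is a hedged sketch of that variation, not the article's own code: the names IndexedQuery and answerInInputOrder are hypothetical, the id field is assumed to hold the query's 0-based position in the input, and std::unordered_map is swapped in for the map.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <unordered_map>
#include <vector>

// Hypothetical query type; id is the query's position in the input.
struct IndexedQuery { int l, r, x, id; };

// Returns answer[id] == true if x occurs in a[l..r] for the query with that id.
std::vector<bool> answerInInputOrder(const std::vector<int>& a,
                                     std::vector<IndexedQuery> queries) {
    const int n = static_cast<int>(a.size());
    const int blockSize = std::max(1, static_cast<int>(std::sqrt(n)));
    std::sort(queries.begin(), queries.end(),
              [blockSize](const IndexedQuery& p, const IndexedQuery& q) {
                  if (p.l / blockSize != q.l / blockSize)
                      return p.l / blockSize < q.l / blockSize;
                  return p.r < q.r;
              });

    std::unordered_map<int, int> freq;  // frequencies in [curL, curR)
    std::vector<bool> answer(queries.size());
    int curL = 0, curR = 0;
    for (const IndexedQuery& q : queries) {
        assert(0 <= q.l && q.l <= q.r && q.r < n);
        assert(0 <= q.id && q.id < static_cast<int>(answer.size()));
        while (curL > q.l) ++freq[a[--curL]];      // extend left
        while (curR <= q.r) ++freq[a[curR++]];     // extend right
        while (curL < q.l) --freq[a[curL++]];      // shrink left
        while (curR > q.r + 1) --freq[a[--curR]];  // shrink right
        auto it = freq.find(q.x);
        answer[q.id] = (it != freq.end() && it->second > 0);
    }
    return answer;
}
```

The caller can then print answer[0], answer[1], and so on, to report results in the same order as the input.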