Sort a nearly sorted (or K sorted) array

Hello Everyone,

Given an array of n elements, where each element is at most k away from its target position, devise an algorithm that sorts in O(n log k) time. For example, let us consider k is 2, an element at index 7 in the sorted array, can be at indexes 5, 6, 7, 8, 9 in the given array.
Examples:

Input : arr[] = {6, 5, 3, 2, 8, 10, 9}
k = 3
Output : arr[] = {2, 3, 5, 6, 8, 9, 10}

Input : arr[] = {10, 9, 8, 7, 4, 70, 60, 50}
k = 4
Output : arr[] = {4, 7, 8, 9, 10, 50, 60, 70}

We can use Insertion Sort to sort the elements efficiently. Following is the C code for standard Insertion Sort.

/* Function to sort an array using insertion sort*/

void insertionSort( int A[], int size)

{

int i, key, j;

for (i = 1; i < size; i++)

{

key = A[i];

j = i-1;

/* Move elements of A[0..i-1], that are

greater than key, to one

position ahead of their current position.

This loop will run at most k times */

while (j >= 0 && A[j] > key)

{

A[j+1] = A[j];

j = j-1;

}

A[j+1] = key;

}

}

The inner loop will run at most k times. To move every element to its correct place, at most k elements need to be moved. So overall complexity will be O(nk)
We can sort such arrays more efficiently with the help of Heap data structure . Following is the detailed process that uses Heap.

  1. Create a Min Heap of size k+1 with first k+1 elements. This will take O(k) time
  2. One by one remove min element from heap, put it in result array, and add a new element to heap from remaining elements.
    Removing an element and adding a new element to min heap will take log k time. So overall complexity will be O(k) + O((n-k) * log(k)).

// A STL based C++ program to sort a nearly sorted array.

#include <bits/stdc++.h>

using namespace std;

// Given an array of size n, where every element

// is k away from its target position, sorts the

// array in O(n logk) time.

int sortK( int arr[], int n, int k)

{

// Insert first k+1 items in a priority queue (or min

// heap)

//(A O(k) operation). We assume, k < n.

priority_queue< int , vector< int >, greater< int > > pq(arr, arr + k + 1);

// i is index for remaining elements in arr[] and index

// is target index of for current minimum element in

// Min Heap 'pq'.

int index = 0;

for ( int i = k + 1; i < n; i++) {

arr[index++] = pq.top();

pq.pop();

pq.push(arr[i]);

}

while (pq.empty() == false ) {

arr[index++] = pq.top();

pq.pop();

}

}

// A utility function to print array elements

void printArray( int arr[], int size)

{

for ( int i = 0; i < size; i++)

cout << arr[i] << " " ;

cout << endl;

}

// Driver program to test above functions

int main()

{

int k = 3;

int arr[] = { 2, 6, 3, 12, 56, 8 };

int n = sizeof (arr) / sizeof (arr[0]);

sortK(arr, n, k);

cout << "Following is sorted array" << endl;

printArray(arr, n);

return 0;

}

Output:

Following is sorted array 2 3 6 8 12 56

The Min Heap based method takes O(n log k) time and uses O(k) auxiliary space.