Monday 14 October 2013

Quick select: Select Kth smallest element in an array of size n in O(n) time.


#include <iostream>
using namespace std;

inline void swap(int arr[],int i,int j) {
      int temp = arr[i];
      arr[i] = arr[j];
      arr[j] = temp;
}

int randint(int len) {
      srand(time(NULL));
    //srand((unsigned)rdtsc());
    return (rand() % (len) + 0);
}

int randomPartition(int arr[],int start , int end) {
      int pivotPos = randint(end-start) + start;
      swap(arr,start,pivotPos);
      
      int pivot = arr[start], i = start+1, j= start+1;
      pivotPos = start;
      for(;j<=end; j++) {
            if(arr[j] < pivot) {
                  swap(arr,i,j);
                  i++;
            }
      }
      swap(arr,pivotPos,i-1);
      pivotPos = i-1;
      return pivotPos;
}

int Rselect(int arr[],int start, int end, int k) {
    int pivotPos = randomPartition(arr,start,end);
    if(pivotPos == k) 
        return arr[k];
    else if(pivotPos < k)
        return Rselect(arr,pivotPos+1,end,k);
    else
        return Rselect(arr,start,pivotPos-1,k);
}

int main() {
    int arr[] = {40,1,10,70,2,5,30,9,8,6};
    int size = sizeof arr/sizeof arr[0];
    cout<<"Kth greatest = "<<Rselect(arr,0,size-1,3);
    system("pause");
    return 0;
}



2 comments:

  1. Hows is it O(n)... you are using quick sort..so it should be O(nlogn)

    ReplyDelete
  2. It is not quick sort but selection step of quick sort.
    If you observe it carefully it does run in O(n) time, although agree worst case is O(n2).

    Let me try to explain, how it works in O(n) time.
    At each step we are picking the pivot element around which we are partitioning the input array.
    Now there are several ways of selecting the pivot element (taking first element as pivot etc), but I have generated a random number between start and end index and used it as pivot element.

    what is the resultant array after one iteration of quick select?
    The result after one iteration of quick select is :
    All elements smaller than pivot element are on the left of pivot and all elements greater than pivot are on right of pivot and pivot position is returned.

    Now lets say we were searching for kth largest element in an array and quick select return pivot position, now there are 3 possible cases:
    1. Pivot position = k
    This is lucky step, if this happens then we have found kth smallest element.
    2. Pivot position > k
    In this case all elements smaller than pivot position are on left of pivot position and k < pivotPostion hence we are interested in only left sub-array and we simply discard right subarray.
    3. Pivot position < k
    In this case kth smallest element is on left hand side of pivot and we simply discard right hand sub-array and repeat step in first half.

    In quick sort we don't discard any sub-array but consider both.
    Hence recurrense relation in quick-sort is :
    T(N) = 2T(N/2) + O(N) = O(NlogN).

    But for quick select recurrence relation is :
    T(N) = T(N/2) + O(N) = O(N)

    Above can be verified by using Master Method.
    Above implementation assume, that there is 50-50 partition on each recursive step.
    If array is sorted in increasing/decreasing order and we are picking the first element as pivot then it is worst case and complexity = O(N2).

    But I have used random number generator to select pivot, and it is highly unlikely that each time it will select the smallest or largest element in array, hence probability of worst case is reduced significantly by using random pivot.

    There is another method for selecting kth smallest element that provides a tighter bound , a linear upper bound i.e. worst case time of O(N).
    It guarantees 30-70 split.
    Please refer median of medians algorithm for that.

    Median of median does has worst case running time of O(N) but it is found in practise quick select is faster as compare to median of median, because of small constant factor and by using random pivot worst case behavior is highly highly unlikely.

    Hope this helps! :)

    ReplyDelete