Quicksort

We’ve looked a variety of in-efficient sorting algorithms, today we’ll look at Quicksort (aka. partition exchange sort), as a first foray into faster and more frequently used sorting algorithms.

Quicksort[1] is a comparison sort using a divide and conquer algorithm, developed by Tony Hoare[2] in 1960. It recursively divides the list into smaller lists around a pivot value and sorts them, which means much smaller data sets when actually sorting. It has a worst case runtime of (O(n2)), but an average and more common runtime of (O(nlogn))

How do it…

We first need an algorithm to determine the pivot value. There are many algorithms, but we’ll use a recommended technique, taking the median of the first value, last value, and middle value in an array:

function getPivot (arr) {
	var iValFirst = arr[0],
			iLast = arr.length - 1,
			iValLast = arr[iLast],
			iMid = Math.floor(arr.length / 2),
			iValMid = arr[iMid];

	// first position is lowest
	if (iValFirst < iValLast && iValFirst < iValMid) {
		return iValMid < iValLast ? iMid : iLast;
	}
	// mid position is lowest
	else if (iMid < iValLast && iMid < iValMid) {
		return iValFirst < iValLast ? 0 : iLast;
	}
	// last position is lowest
	else {
		return iValFirst < iValMid ? 0 : iMid;
	}
}

Here is a the quicksort algorithm in JavaScript:

function quicksort (arr) {
	var aLess = [],
        aGreater = [],
        i,
        j,
        iPivot,
        iPivotVal;

	// array of length zero or one is already sorted
	if (arr.length <= 1) {
		return arr;
	}

	iPivot = getPivot(arr);
	iPivotVal = arr[iPivot];

	// the function to process the value and compare it to
	// the pivot value and put into the correct array
	function compVal(iVal) {
		(iVal <= iPivotVal ? aLess : aGreater).push(iVal);
	}

	// compare values before the pivot
	for (i = 0, j = iPivot; i < j; i++) {
		compVal(arr[i]);
	}

	// compare values after the pivot
	for (i = iPivot + 1, j = arr.length; i < j; i++) {
		compVal(arr[i]);
	}

	// create a new list from sorted sublists around the pivot value
	return quicksort(aLess).concat([iPivotVal], quicksort(aGreater));
}

It is a little hard to follow, but this demo chooses a pivot (blue), then puts the values into the two sub-arrays (red), before merging the arrays around the pivot, placing the pivot in its final position (black):


Restart Quicksort With a New Array

If we take a 10,000 element array and sort it, here is how quicksort compares to native array sort (prints to the console):

Quicksort Comparison Tests

How it works…

The Quicksort algorithm breaks the array into two parts around a pivot value, pushing smaller values into one array and larger values into a second array. These two arrays will then be quicksorted recursively, before concatenating them around the pivot value. Generally, this strategy will quickly sort the array, but the efficiency of the algorithm depends on how well the pivot value is calculated. The more equal in size the smaller and larger value arrays are, the more efficient the algorithm.

There are many strategies for choosing a pivot value, such as picking a dedicated position (first or middle are common), or randomly choosing a position. However, none of these do anything to increase the likelihood that the two arrays of values are about the same size. The recommended strategy for finding the pivot is to take the median of the first value, middle value, and last value in the array. This is more likely to get a value that is near the middle when sorting. These values are also better than randomly choosing three, because they sort already sorted arrays faster.

For a random distribution quicksort tends to perform better than most sorting algorithms, because it quickly divides the array into smaller arrays that tend to sort quickly. However, it will perform quadratically on array with mostly duplicate values, as the pivot does not help much and the split arrays are unbalanced. If you compared it to the built in sorting algorithm (above), you⁏ll notice only a few milliseconds of difference between the two on a large array.

Lastly, Quicksort is not very efficient on small data sets since its constant operation (slicing and merging arrays) is more expensive than other sorting algorithms. One of the best optimizations is to determine a threshold k where the smaller arrays are sorted using Insertion Sort instead. The k value is best determined by experimenting on your data set.

References

  1. Quicksort Wikipedia article
  2. Stony Hoare