It's been a while since I reviewed my understanding of some of the fundamental sorting algorithms so I thought it would be fun to implement a few of them and compare their performance. In this post I have provided code demonstrating Selection Sort, Insertion Sort, Merge Sort and Quick Sort along with some unit tests to compare their performance for different input sizes.
This was an interesting exercise and pretty eye opening to see the difference in performance between the sorting algorithms. The input for each algorithm is an array of items that can be compared to one another and have a final order. The unit tests supplied allow you to see the performance of each algorithm when run against small and large inputs. On small inputs the difference is imperceptible but as the input grows you will find the performance gaps start to emerge. These algorithms don't take account of some of the edge cases and there is room for improvement. Some edge cases might include input that is already partially sorted or has many duplicate keys.
This algorithm starts with the first element and iterates over the remaining elements in the input to find the lowest value item. It then swaps the smallest item with the first item and moves its pointer to the second item and repeats the above process. This means that for each index i in an array of size n, the item at i is compared to every item at indexes between i + 1 and n - 1 inclusive. Each iteration of the outer loop moves one item into its final position. This involves a lot of comparisons and is relatively slow.
This algorithm starts by again working its way through from left to right. It picks up the next item and compares it to all previous (sorted) items. If a preceding item is smaller then it swaps it, otherwise it breaks out of the inner loop since the item is in the correct position amongst the processed items and moves onto the next item.
These two algorithms both sort the input in place and so are good choices when sorting in place is a requirement. However, Selection Sort does more comparisons and Insertion Sort does more writes so one may be preferable over the other depending on the cost of comparison vs cost of copying.
Merge sort relies on the concept of divide-and-conquer. It recursively splits up the input into smaller arrays and sorts these sub arrays. It then merges the already-sorted sub-arrays as the recursion unravels. It does use additional space to copy arrays so may not be a good choice where memory might be a constraint.
You might have guessed by its name but Quick Sort is the fastest choice here for larger inputs (with some caveats discussed later). The first thing that is done is a random shuffle to avoid the worst case of an already sorted input - which can occur quite frequently in practice. Then an item is chosen to partition the input and any item smaller than the partition item is placed to the left of the partition item, any item greater than the partition item is placed to it's right. We repeat this for the list on either side until everything is sorted.
Quick Sort is the fastest algorithm here as long as the items being sorted fit in memory. If your input spills over to disk Merge Sort can have an advantage as it uses sequential reads, something that disks (at least the spinning rust kind) are good at. Quick Sort also performs the sort in place so does not use the additional storage required by Merge Sort.
You can find all the code for this on GitHub along with the unit tests.