Sorting Algorithms Overview and Analysis
Unit 8 Sorting
8.1 Introduction
Sorting is the technique of organizing data in either ascending or descending order, i.e., bringing
some order to the data. Sort methods are very important in data structures.
Sorting can be performed on any one attribute, or on a combination of attributes, present in each record.
Searching is much easier and more efficient when the data is stored in sorted order. Sorting is
performed according to the key value of each record. Depending on the makeup of the key, records
can be sorted either numerically or alphanumerically. In numerical sorting, the records are arranged in
ascending or descending order according to the numeric value of the key.
Let A be a list of n elements a1, a2, a3, …, an in memory. Sorting A refers to the operation of rearranging
the contents of A so that they are in increasing order, that is, so that a1 <= a2 <= a3 <= ... <= an. Since A
has n elements, there are n! ways its contents can appear in A. These ways correspond
precisely to the n! permutations of 1, 2, 3, …, n. Accordingly, each sorting algorithm must be able to
handle all n! of these possibilities.
Internal Sorting
Internal sorting is used when a small amount of data has to be sorted. In this method, the data
to be sorted fits entirely in main memory (RAM), so records can be accessed
randomly. Examples: bubble sort, insertion sort, selection sort, shell sort, quick sort, radix sort,
heap sort.
External Sorting
External sorting is used when a large amount of data has to be sorted. In this method, the data
to be sorted is stored partly in main memory and partly in secondary memory such as disk. External
sorting methods can access records only in sequential order. Examples: merge sort, multi-way merge sort.
Bubble Sort
Although it is simple to use, bubble sort is primarily an educational tool, because its real-world
performance is poor and it is not suitable for large data sets. The average and worst-case
complexity of bubble sort is O(n²), where n is the number of items. Bubble sort is mainly used where
• complexity does not matter
• simple and short code is preferred
Algorithm
In the algorithm given below, suppose arr is an array of n elements. The assumed swap function in the
algorithm will exchange the values of the given array elements. Note that the comparison pass must be
repeated until the array is sorted; a single pass only moves the largest element to the end.
BubbleSort (arr)
{
    for each pass i = 0 to n-2
        for j = 0 to n-2-i
            if (arr[j] > arr[j+1])
                swap(arr[j], arr[j+1])
    return arr
}
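The pseudocode above can be sketched as a runnable function. The following Python version is only an illustration; the function name and the sample values are my own choices, not from the text:

```python
def bubble_sort(arr):
    """Sort arr in place in ascending order by repeatedly swapping adjacent pairs."""
    n = len(arr)
    for i in range(n - 1):             # after pass i, the i-th largest element is in place
        for j in range(n - 1 - i):     # only the unsorted prefix needs comparing
            if arr[j] > arr[j + 1]:
                arr[j], arr[j + 1] = arr[j + 1], arr[j]   # swap the out-of-order pair
    return arr
```

For example, `bubble_sort([5, 1, 4, 2, 8])` yields `[1, 2, 4, 5, 8]`.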
Remaining Iterations
The same process goes on for the remaining iterations. After each iteration, the largest element
among the unsorted elements is placed at the end. In each iteration, the comparison takes place up
to the last unsorted element.
The array is sorted when all the unsorted elements are placed at their correct positions.
Time Complexity
Best Case Complexity - It occurs when no sorting is required, i.e., the array is already sorted. The
best-case time complexity of (optimized) bubble sort is O(n).
Average Case Complexity - It occurs when the array elements are in jumbled order, neither properly
ascending nor properly descending. The average case time complexity of bubble sort is O(n²).
Worst Case Complexity - It occurs when the array elements have to be sorted in reverse order.
That means, suppose you have to sort the array elements in ascending order, but they are given in
descending order. The worst-case time complexity of bubble sort is O(n²).
Space Complexity
The space complexity of bubble sort is O(1), because only an extra variable is required for swapping.
The optimized bubble sort uses one further variable (the swapped flag), but any constant number of
variables is still O(1), so its space complexity is O(1) as well. Bubble sort is stable.
In the plain bubble sort algorithm, comparisons are made even when the array is already sorted,
which increases the execution time unnecessarily. To avoid this, we can use an extra variable swapped.
It is set to true whenever a swap occurs; otherwise it remains false. If, after a complete pass,
swapped is still false, the elements are already sorted and no further iterations are required.
This early exit reduces the execution time and optimizes bubble sort.
Algorithm
bubbleSort(array)
    n = length(array)
    repeat
        swapped = false
        for i = 1 to n - 1
            if array[i - 1] > array[i], then
                swap(array[i - 1], array[i])
                swapped = true
            end if
        end for
        n = n - 1
    until not swapped
end bubbleSort
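The optimized algorithm above translates directly into code. This Python sketch is illustrative only; the name and sample values are my own:

```python
def bubble_sort_optimized(array):
    """Bubble sort with an early exit: stop as soon as a full pass makes no swap."""
    n = len(array)
    swapped = True
    while swapped:
        swapped = False
        for i in range(1, n):
            if array[i - 1] > array[i]:
                array[i - 1], array[i] = array[i], array[i - 1]
                swapped = True                 # record that this pass changed something
        n -= 1                                 # the largest unsorted element is now in place
    return array
```

On an already-sorted input the while loop exits after a single pass, which is where the O(n) best case comes from.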
Quick Sort
Quicksort is a divide-and-conquer algorithm.
Divide: First pick a pivot element. Then partition (rearrange) the array into two sub-arrays such
that each element in the left sub-array is less than or equal to the pivot and each element in the
right sub-array is greater than the pivot.
Quicksort picks an element as the pivot and partitions the given array around it: the large array is
divided into two sub-arrays, one holding values smaller than the pivot and the other holding values
greater than the pivot. After that, the left and right sub-arrays are partitioned using the same
approach, continuing until each sub-array holds a single element.
Picking a good pivot is necessary for a fast implementation of quicksort, but it is difficult to
determine a good pivot in advance. Some of the ways of choosing a pivot are as follows -
• Pivot can be random, i.e. select the random pivot from the given array.
• Pivot can either be the rightmost element or the leftmost element of the given array.
• Select median as the pivot element.
Algorithm
Consider the array [24, 9, 29, 14, 19, 27], taking the leftmost element as the pivot. So initially
a[left] = 24, a[right] = 27 and a[pivot] = 24. Since the pivot is at the left, the algorithm starts from
the right and moves towards the left.
Now a[pivot] = 24 < a[right] = 27, so the algorithm moves the right pointer one position towards the
left, giving a[right] = 19.
Now a[left] = 24, a[right] = 19 and a[pivot] = 24. Because a[pivot] > a[right], the algorithm swaps
a[pivot] with a[right], and the pivot moves to the right: [19, 9, 29, 14, 24, 27].
Now a[left] = 19, a[right] = 24 and a[pivot] = 24. Since the pivot is at the right, the algorithm starts
from the left and moves to the right. As a[pivot] > a[left], the left pointer moves one position to the
right, giving a[left] = 9.
Now a[left] = 9, a[right] = 24 and a[pivot] = 24. As a[pivot] > a[left], the left pointer again moves one
position to the right, giving a[left] = 29.
Now a[left] = 29, a[right] = 24 and a[pivot] = 24. As a[pivot] < a[left], swap a[pivot] and a[left]; the
pivot is now at the left: [19, 9, 24, 14, 29, 27].
Since the pivot is at the left, the algorithm starts from the right and moves to the left. Now
a[left] = 24, a[right] = 29 and a[pivot] = 24. As a[pivot] < a[right], the right pointer moves one
position to the left, giving a[right] = 14.
Now a[pivot] = 24, a[left] = 24 and a[right] = 14. As a[pivot] > a[right], swap a[pivot] and a[right];
the pivot is now at the right: [19, 9, 14, 24, 29, 27].
Now a[pivot] = 24, a[left] = 14 and a[right] = 24. The pivot is at the right, so the algorithm starts
from the left and moves to the right.
Now a[pivot] = 24, a[left] = 24 and a[right] = 24, i.e., pivot, left and right all point to the same
element. This marks the termination of the procedure. Element 24, the pivot, is placed at its exact
position: the elements to its right ([29, 27]) are greater than it, and the elements to its left
([19, 9, 14]) are smaller than it.
Now, in the same manner, the quick sort algorithm is applied separately to the left and right
sub-arrays. When sorting is done, the array is [9, 14, 19, 24, 27, 29].
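The walkthrough above uses a two-pointer partition with the leftmost element as pivot. As a hedged sketch, the Python code below instead uses the common Lomuto partition scheme (last element as pivot), which produces the same result and is easier to state compactly; all names are my own:

```python
def quick_sort(a, low=0, high=None):
    """Recursive quicksort using the Lomuto partition scheme."""
    if high is None:
        high = len(a) - 1
    if low < high:
        p = partition(a, low, high)
        quick_sort(a, low, p - 1)     # sort the elements smaller than the pivot
        quick_sort(a, p + 1, high)    # sort the elements larger than the pivot

def partition(a, low, high):
    """Place the pivot a[high] at its final position and return that index."""
    pivot = a[high]
    i = low - 1                        # right edge of the "<= pivot" region
    for j in range(low, high):
        if a[j] <= pivot:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[i + 1], a[high] = a[high], a[i + 1]
    return i + 1
```

Running `quick_sort` on the walkthrough's array `[24, 9, 29, 14, 19, 27]` sorts it to `[9, 14, 19, 24, 27, 29]`.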
Quicksort Complexity
Time Complexity
Best Case Complexity - In quicksort, the best case occurs when the pivot element is the middle
element or near the middle element, so the partitions are balanced. The best-case time complexity
of quicksort is O(n log n).
Average Case Complexity - It occurs when the array elements are in jumbled order, neither properly
ascending nor properly descending. The average case time complexity of quicksort is O(n log n).
Worst Case Complexity - In quicksort, the worst case occurs when the pivot element is always the
greatest or smallest element. For example, if the pivot is always chosen as the last element of the
array, the worst case occurs when the given array is already sorted in ascending or descending
order. The worst-case time complexity of quicksort is O(n²). Although this worst case is worse than
that of sorting algorithms such as merge sort and heap sort, quicksort is still faster in practice.
The worst case rarely occurs and can be largely avoided by changing the choice of pivot, for
example by picking it at random.
Space Complexity
The space complexity of quicksort is O(log n) on average, for the recursion stack (O(n) in the worst
case), and it is not stable.
Disadvantages
• It is a somewhat complex method of sorting.
• It is a little harder to implement than other sorting methods.
• It does not perform well on small groups of elements.
Merge Sort
Merge sort divides the list into halves again and again until the list cannot be divided further. Then we
combine pairs of one-element lists into two-element lists, sorting them in the process. The sorted
two-element pairs are merged into four-element lists, and so on, until we get the fully sorted list.
Suppose we had to sort an array A. A subproblem would be to sort a sub-section of this array starting
at index p and ending at index r, denoted as A[p..r].
Divide
If q is the half-way point between p and r, then we can split the subarray A[p..r] into two
subarrays A[p..q] and A[q+1..r].
Conquer
In the conquer step, we sort both the subarrays A[p..q] and A[q+1..r]. If we haven't yet reached
the base case, we again divide both these subarrays and try to sort them.
Combine
When the conquer step reaches the base case and we have two sorted subarrays A[p..q] and
A[q+1..r] for the array A[p..r], we combine the results by building one sorted array A[p..r] from the
two sorted subarrays A[p..q] and A[q+1..r].
Algorithm
In the following algorithm, arr is the given array, beg is the starting index, and end is the last
index of the array.
The important part of merge sort is the MERGE function. This function merges two sorted
sub-arrays, A[beg…mid] and A[mid+1…end], to build one sorted array A[beg…end]. So the inputs of
the MERGE function are A[], beg, mid, and end.
For example, the array A[0..5] may contain two sorted subarrays A[0..3] and A[4..5]. The merge
function combines two such subarrays as follows (here p = beg, q = mid, r = end):
Step 1: Create temporary copies of the two sorted subarrays, L = A[p..q] and M = A[q+1..r].
Step 2: Maintain a current index into L, into M, and into A[p..r], each starting at the beginning.
Step 3: Until we reach the end of either L or M, pick the smaller of the current elements of L and M
and place it at the next position of A[p..r].
Step 4: When we run out of elements in either L or M, copy the remaining elements of the other
into A[p..r].
This last copying step is needed for whichever of L and M still has elements left. At the end of the
merge function, the subarray A[p..r] is sorted.
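The steps above can be sketched in Python (with p = beg, q = mid, r = end); the function names and sample values are illustrative, not from the text:

```python
def merge_sort(A, p=0, r=None):
    """Sort A[p..r] in place by recursive halving and merging."""
    if r is None:
        r = len(A) - 1
    if p < r:
        q = (p + r) // 2           # half-way point between p and r
        merge_sort(A, p, q)        # conquer the left half
        merge_sort(A, q + 1, r)    # conquer the right half
        merge(A, p, q, r)          # combine the two sorted halves

def merge(A, p, q, r):
    """Merge the sorted subarrays L = A[p..q] and M = A[q+1..r] back into A[p..r]."""
    L = A[p:q + 1]                 # Step 1: temporary copies
    M = A[q + 1:r + 1]
    i = j = 0                      # Step 2: current indices into L, M
    k = p                          # and into A[p..r]
    while i < len(L) and j < len(M):
        if L[i] <= M[j]:           # Step 3: pick the smaller front element
            A[k] = L[i]
            i += 1
        else:
            A[k] = M[j]
            j += 1
        k += 1
    A[k:r + 1] = L[i:] + M[j:]     # Step 4: copy whatever remains
```

Using `<=` in the comparison is what makes the merge, and hence merge sort, stable.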
Time Complexity
Best Case Complexity - It occurs when no sorting is required, i.e., the array is already sorted. The
best-case time complexity of merge sort is O(n log n).
Average Case Complexity - It occurs when the array elements are in jumbled order, neither properly
ascending nor properly descending. The average case time complexity of merge sort is O(n log n).
Worst Case Complexity - It occurs when the array elements have to be sorted in reverse order.
The worst-case time complexity of merge sort is O(n log n).
Space Complexity
The space complexity of merge sort is O(n), because merging requires a temporary array of up to n
elements. Merge sort is stable.
Advantages
• Merge sort is a stable sort.
• It is easy to understand.
• It gives good, predictable performance: O(n log n) in every case.
Disadvantages
• It requires extra memory space: elements are copied to an additional temporary array.
• It can be slower in practice than in-place algorithms such as quicksort.
Selection Sort
In selection sort, the smallest element is selected from the unsorted array and placed at the first
position. After that, the second smallest element is selected and placed at the second position. The
process continues until the array is entirely sorted.
The average and worst-case complexity of selection sort is O(n²), where n is the number of items;
due to this, it is not suitable for large data sets. Selection sort is generally used when only a small
list is to be sorted, or when the cost of writing (swapping) elements matters, since it performs at
most n - 1 swaps.
Algorithm
SELECTION SORT(arr, n)
Step 1: Repeat Steps 2 and 3 for i = 0 to n-1
Step 2: Call SMALLEST(arr, i, n, pos)
Step 3: Swap arr[i] with arr[pos]
[END OF LOOP]
Step 4: EXIT
Now, for the first position in the sorted part, the entire array is scanned sequentially. At present,
12 is stored at the first position; after searching the entire array, it is found that 8 is the smallest
value.
So, swap 12 with 8. After the first iteration, 8 appears at the first position in the sorted array.
For the second position, where 29 is currently stored, we again sequentially scan the rest of the
unsorted array. After scanning, we find that 12 is the second lowest element in the array and should
appear at the second position.
Now, swap 29 with 12. After the second iteration, 12 appears at the second position in the sorted
array. So, after two iterations, the two smallest values are placed at the beginning in sorted order.
The same process is applied to the rest of the array elements.
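The SELECTION SORT and SMALLEST steps can be sketched together in Python. The sample array below is assembled from the values mentioned in the walkthrough (12, 29, 8) plus filler values of my own:

```python
def selection_sort(arr):
    """Repeatedly select the smallest remaining element and swap it into place."""
    n = len(arr)
    for i in range(n - 1):
        pos = i                                # SMALLEST: index of the minimum so far
        for j in range(i + 1, n):
            if arr[j] < arr[pos]:
                pos = j
        arr[i], arr[pos] = arr[pos], arr[i]    # one swap per pass
    return arr
```

Note how only one swap happens per outer pass, which is why selection sort does at most n - 1 swaps in total.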
Time Complexity
Best Case Complexity - It occurs when no sorting is required, i.e., the array is already sorted. Even
then, every pass still scans the whole unsorted part, so the best-case time complexity of selection
sort is O(n²).
Average Case Complexity - It occurs when the array elements are in jumbled order, neither properly
ascending nor properly descending. The average case time complexity of selection sort is O(n²).
Worst Case Complexity - It occurs when the array elements have to be sorted in reverse order. The
worst-case time complexity of selection sort is O(n²).
Space Complexity
The space complexity of selection sort is O(1), because only an extra variable is required for
swapping. Note that the usual array implementation of selection sort is not stable, since a
long-distance swap can reorder equal elements.
Insertion Sort
Insertion sort works the way many people sort playing cards in their hands: take one element at a
time and insert it into its proper place within the already-sorted part of the array. Although it is
simple to use, it is not appropriate for large data sets, as the time complexity of insertion sort in
the average and worst case is O(n²), where n is the number of items. Insertion sort is less efficient
than algorithms such as heap sort, quick sort and merge sort.
In practice insertion sort is roughly twice as fast as bubble sort, because it makes fewer
comparisons: the inner scan stops as soon as an element smaller than the value being inserted is
found, so the earlier (even smaller) elements are never examined. Insertion sort is a good choice
for small inputs and for nearly sorted data.
Algorithm
The first element in the array is assumed to be sorted. Take the second element and store it
separately in key. Compare key with the first element: if the first element is greater than key, then
key is placed in front of the first element.
Now the first two elements are sorted. Take the third element and compare it with the elements to
its left. Place it just behind the first element smaller than it; if there is no element smaller than it,
place it at the beginning of the array. Repeat this for every remaining element.
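The insertion procedure above can be sketched as follows; the Python version, name, and sample values are illustrative:

```python
def insertion_sort(arr):
    """Grow a sorted prefix; insert each new element just after its first smaller neighbour."""
    for i in range(1, len(arr)):
        key = arr[i]                  # element to insert into the sorted prefix
        j = i - 1
        while j >= 0 and arr[j] > key:
            arr[j + 1] = arr[j]       # shift larger elements one place to the right
            j -= 1
        arr[j + 1] = key              # drop key into the gap
    return arr
```

The `while` condition stops at the first element not greater than `key`, which gives the O(n) best case on already-sorted input.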
Time Complexity
Best Case Complexity - It occurs when no sorting is required, i.e., the array is already sorted. The
best-case time complexity of insertion sort is O(n).
Average Case Complexity - It occurs when the array elements are in jumbled order, neither properly
ascending nor properly descending. The average case time complexity of insertion sort is O(n²).
Worst Case Complexity - It occurs when the array elements have to be sorted in reverse order. The
worst-case time complexity of insertion sort is O(n²).
Space Complexity
The space complexity of insertion sort is O(1), because only one extra variable (key) is required.
Insertion sort is stable.
Disadvantages
• It is less efficient on lists containing a large number of elements.
• As the number of elements increases, performance degrades quadratically.
Shell Sort
Shell sort is a generalization of insertion sort that overcomes one of its drawbacks by comparing
elements separated by a gap of several positions.
It is an extended version of insertion sort that improves insertion sort's average time complexity.
Like insertion sort, it is a comparison-based, in-place sorting algorithm. Shell sort is efficient for
medium-sized data sets.
In insertion sort, elements can be moved ahead by only one position at a time. Moving an element
to a far-away position therefore requires many movements, which increases the algorithm's
execution time. Shell sort overcomes this drawback by allowing the movement and swapping of
far-away elements as well.
The algorithm first sorts the elements that are far apart from each other, then successively reduces
the gap between them. This gap is called the interval.
The performance of the shell sort depends on the type of sequence used for a given input array.
Algorithm
We will use the original shell sort sequence, i.e., N/2, N/4, …, 1, as the intervals. Consider the array
[33, 31, 40, 8, 12, 17, 25, 42]. In the first loop, n is equal to 8 (the size of the array), so the elements
lie at an interval of 4 (n/2 = 4). Elements will be compared and swapped if they are not in order.
Here, in the first loop, the element at position 0 is compared with the element at position 4. If the
0th element is greater, it is swapped with the element at position 4; otherwise both stay where they
are. This process continues for the remaining elements.
At the interval of 4, the sublists are {33, 12}, {31, 17}, {40, 25}, {8, 42}.
Now we compare the values in each sublist and swap them, if required, in the original array. After
comparing and swapping, the updated array is [12, 17, 25, 8, 33, 31, 40, 42].
In the second loop, elements lie at the interval of 2 (n/4 = 2), where n = 8. With an interval of 2,
two sublists are generated: {12, 25, 33, 40} and {17, 8, 31, 42}.
Again we compare the values in each sublist and swap them, if required, in the original array. After
comparing and swapping, the updated array is [12, 8, 25, 17, 33, 31, 40, 42].
In the third loop, elements lie at the interval of 1 (n/8 = 1), where n = 8. Finally, we use the interval
of 1 to sort the rest of the array; in this step, shell sort performs an ordinary insertion sort,
producing the fully sorted array [8, 12, 17, 25, 31, 33, 40, 42].
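The gap-halving procedure above can be sketched in Python with the same N/2, N/4, …, 1 sequence; the function name is my own:

```python
def shell_sort(arr):
    """Gapped insertion sort using the original Shell gap sequence n/2, n/4, ..., 1."""
    n = len(arr)
    gap = n // 2
    while gap > 0:
        for i in range(gap, n):                      # insertion sort over each gap-sublist
            key = arr[i]
            j = i
            while j >= gap and arr[j - gap] > key:
                arr[j] = arr[j - gap]                # shift by whole gaps, not by one
                j -= gap
            arr[j] = key
        gap //= 2                                    # shrink the interval
    return arr
```

With gap = 1 the loop degenerates into plain insertion sort, exactly as the walkthrough's final pass describes.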
Time Complexity
Best Case Complexity - It occurs when no sorting is required, i.e., the array is already sorted. The
best-case time complexity of shell sort is O(n log n).
Average Case Complexity - It occurs when the array elements are in jumbled order. The average
case time complexity of shell sort is about O(n (log n)²), and it depends on the gap sequence used.
Worst Case Complexity - It occurs when the array elements have to be sorted in reverse order. The
worst-case time complexity of shell sort with the original gap sequence is O(n²).
Space Complexity
The space complexity of shell sort is O(1), and it is not stable.
Radix Sort
Radix sort works much like sorting students' names into alphabetical order. In that case there are
26 radixes, one for each letter of the English alphabet. In the first pass, the names are grouped in
ascending order of the first letter of each name; in the second pass, they are grouped in ascending
order of the second letter; and the process continues until the list is sorted.
Algorithm
radixSort(arr)
    max = largest element in the given array
    d = number of digits in the largest element (max)
    create 10 buckets, labelled 0 - 9
    for i = 0 to d - 1
        sort the array elements using counting sort (or any stable sort)
        according to the digit at the ith place
First, find the largest element (call it max) in the given array, and let x be the number of digits in
max. We compute x because we need to go through all the significant places of every element.
Then go through each significant place one by one, using any stable sorting algorithm to sort the
elements by the digit at that place.
In the given array, the largest element is 736, which has 3 digits. So the loop runs three times
(i.e., up to the hundreds place): three passes are required to sort the array. First sort the elements
on the basis of the unit-place digits (i.e., x = 0). Here we use the counting sort algorithm to sort the
elements.
Pass 1:
In the first pass, the list is sorted on the basis of the digits at the units place.
Pass 2:
In this pass, the list is sorted on the basis of the next significant digits (i.e., the digits at the tens place).
Pass 3:
In this pass, the list is sorted on the basis of the next significant digits (i.e., the digits at the hundreds place).
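The three passes above can be sketched in Python. As a simplification, this sketch distributes elements into ten per-digit bucket lists, which plays the same role as the stable counting sort named in the text; the function name and sample values are my own, and only non-negative integers are assumed:

```python
def radix_sort(arr):
    """LSD radix sort for non-negative integers, one stable pass per digit."""
    if not arr:
        return arr
    place = 1
    while max(arr) // place > 0:              # one pass per digit of the largest element
        buckets = [[] for _ in range(10)]     # ten buckets, one per digit value 0-9
        for x in arr:
            buckets[(x // place) % 10].append(x)   # stable: insertion order preserved
        arr[:] = [x for bucket in buckets for x in bucket]
        place *= 10                           # units, then tens, then hundreds, ...
    return arr
```

Because each pass is stable, ties on the current digit keep the order established by the earlier, less significant digits.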
Time Complexity
Best Case Complexity - It occurs when no sorting is required, i.e., the array is already sorted. The
best-case time complexity of radix sort is Ω(n + k).
Average Case Complexity - It occurs when the array elements are in jumbled order. The average
case time complexity of radix sort is Θ(nk).
Worst Case Complexity - It occurs when the array elements have to be sorted in reverse order. The
worst-case time complexity of radix sort is O(nk).
Radix sort is a non-comparative sorting algorithm. Its O(nk) running time is linear in n for a fixed
number of digits k, which beats the Ω(n log n) lower bound that applies to comparison-based
sorting algorithms.
Space Complexity
The space complexity of Radix sort is O (n + k) and is stable.
Heap Sort
What is a heap?
A heap is a complete binary tree. A binary tree is a tree in which each node has at most two
children; a complete binary tree is a binary tree in which all levels except possibly the last are
completely filled, and all nodes in the last level are as far left as possible.
Algorithm
HeapSort(arr)
    BuildMaxHeap(arr)
    for i = length(arr) downto 2
        swap arr[1] with arr[i]
        heap_size[arr] = heap_size[arr] - 1
        MaxHeapify(arr, 1)
End
BuildMaxHeap(arr)
    heap_size[arr] = length(arr)
    for i = length(arr)/2 downto 1
        MaxHeapify(arr, i)
End
MaxHeapify(arr, i)
    L = left(i)
    R = right(i)
    if L <= heap_size[arr] and arr[L] > arr[i]
        largest = L
    else
        largest = i
    if R <= heap_size[arr] and arr[R] > arr[largest]
        largest = R
    if largest != i
        swap arr[i] with arr[largest]
        MaxHeapify(arr, largest)
End
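The pseudocode above uses 1-based heap indices. A hedged Python sketch with the usual 0-based indexing (children of i at 2i+1 and 2i+2) looks as follows; the names and sample values are my own:

```python
def heap_sort(arr):
    """Build a max-heap, then repeatedly swap the root to the end and re-heapify."""
    n = len(arr)
    for i in range(n // 2 - 1, -1, -1):      # BuildMaxHeap: heapify from the last parent up
        max_heapify(arr, n, i)
    for end in range(n - 1, 0, -1):
        arr[0], arr[end] = arr[end], arr[0]  # move the current maximum to its final place
        max_heapify(arr, end, 0)             # restore the heap on the shrunken prefix
    return arr

def max_heapify(arr, heap_size, i):
    """Sift arr[i] down until the subtree rooted at i satisfies the max-heap property."""
    largest = i
    left, right = 2 * i + 1, 2 * i + 2
    if left < heap_size and arr[left] > arr[largest]:
        largest = left
    if right < heap_size and arr[right] > arr[largest]:
        largest = right
    if largest != i:
        arr[i], arr[largest] = arr[largest], arr[i]
        max_heapify(arr, heap_size, largest)
```

Each `max_heapify` call walks at most one root-to-leaf path of length log n, which is the source of the O(n log n) bound.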
Heap sort basically involves two phases:
• First, build a heap by adjusting the elements of the array.
• Then repeatedly remove the root element of the heap by moving it to the end of the array, and
restore the heap structure over the remaining elements.
First, we construct a heap from the given array and convert it into a max heap; after this
conversion, the largest element, 89, is at the root.
Next, we delete the root element (89) from the max heap. To delete this node, we swap it with the
last node, i.e., 11, so that 89 lands at the end of the array, in its final position. After deleting the
root element, we heapify the remaining elements to convert them into a max heap again.
In the next step, we delete the new root element (81) by swapping it with the last node of the heap,
i.e., 54, and again heapify the remaining elements into a max heap.
Next, we delete the root element (76) by swapping it with the last node, i.e., 9, and heapify again.
Next, we delete the root element (54) by swapping it with the last node, i.e., 14, and heapify again.
Next, we delete the root element (22) by swapping it with the last node, i.e., 11, and heapify again.
Next, we delete the root element (14) by swapping it with the last node, i.e., 9, and heapify again.
Next, we delete the root element (11) by swapping it with the last node, i.e., 9.
Now the heap has only one element left. After deleting it, the heap is empty and the array is fully
sorted: [9, 11, 14, 22, 54, 76, 81, 89].
Time Complexity
Best Case Complexity - It occurs when no sorting is required, i.e., the array is already sorted. The
best-case time complexity of heap sort is O(n log n).
Average Case Complexity - It occurs when the array elements are in jumbled order. The average
case time complexity of heap sort is O(n log n).
Worst Case Complexity - It occurs when the array elements have to be sorted in reverse order. The
worst-case time complexity of heap sort is O(n log n).
So the time complexity of heap sort is O(n log n) in all three cases (best, average and worst): the
height of a complete binary tree with n elements is log n, and each of the n root deletions heapifies
along at most one root-to-leaf path.
Space Complexity
The space complexity of heap sort is O(1), and it is not stable.