Sort Comparisons

selection_sortThis entry implements a class to compare a couple (more to come in the next few days) sorting algorithms. The Java code is based on examples from the Algorithms fourth edition book by Robert Sedgewick and Kevin Wayne.

As you might already know, both Selection and Insertion sorts have a complexity of O(n^2). That said; in practice Insertion seems to be faster than Selection. If you take a look at the implementations, it is easy to insertion_sortdetermine that both algorithms use nested loops, but the number of cycles in the inner loops is dependent on the initial order of elements in the array for one of the sorts.

In previous entries in this blog, we have looked at time comparisons between Selection and Insertion sorts. The code in this blog will grow to include tests for: Bubble, Heap, Insertion, Merge, Quick, Radix, Selection and Shell sort algorithms. At this time it only includes Insertion and Selection.

Following is output from the console on the Eclipse IDE:

main <<< alg1 ==>Insertion<==

main <<< alg2 ==>Selection<==

main <<<    N: 1000

main <<<    T: 100

main <<< Insertion is 0.42 times faster than Selection

The software generates in this case 1,000 random values in the range <0.0 : 1.0>. The values are use to populate an array. The Insertion algorithm is called to sort 100 times different instances of the array. If the sorted array is used, the performance difference is much larger. After processing 100 instances of different arrays, the program repeats the process using the Selection sorting algorithm. The time comparison between them is then displayed. By the way, test parameter constants are included in the source code. This is done for efficiency when using an IDE. Generating a JAR to test tends to be slower lending to reduce experimentation.

The Java test code follows:

package john.canessa.sort;

import java.util.Random;

public class SortCompare {

/*

*

*/

public static double time(String alg, Double[] a) throws Exception {

double begin = System.currentTimeMillis();

switch (alg) {

case “Insertion”:

Insertion.sort(a);

break;

case “Selection”:

Selection.sort(a);

break;

default:

throw new Exception(“unknown alg ==>” + alg + “<==”);

}

double end    = System.currentTimeMillis();

return end – begin;

}

/*

* for specified algorithm sort N random values T times

*/

public static double timeRandomInput(String alg, int N, int T) throws Exception {

double totalTime     = 0.0;

Double[] a                 = new Double[N];

Random rand          = new Random();

for (int t = 0; t < T; t++) {

for (int n = 0; n < N; n++) {

a[n] = rand.nextDouble();

}

totalTime += time(alg, a);

}

return totalTime;

}

/*

* test code

*/

public static void main(String[] args) throws Exception {

final String alg1    = “Insertion”;

final String alg2    = “Selection”;

final int N                = 1000;

final int T                = 100;

System.out.println(“main <<< alg1 ==>” + alg1 + “<==”);

System.out.println(“main <<< alg2 ==>” + alg2 + “<==”);

System.out.println(“main <<<    N: ” + N);

System.out.println(“main <<<    T: ” + T);

double t1     = timeRandomInput(alg1, N, T);

double t2     = timeRandomInput(alg2, N, T);

System.out.printf(“main <<< %s is %.2f times faster than %s\n”, alg1, t1 / t2, alg2);

}

}

Most developers tend to think that exclusively using Quick sort is the way to go. Perhaps it is the easiest way to go. Depending on the characteristics of the data (e.g., duplicates, initial order, number of elements), sometimes other algorithms may outperform it.

The authors of the Algorithm books provide the following advice:

“Full development of such results [sorting algorithms in this case] is reserved for experts studying our most important algorithms, but every programmer using an algorithm should be aware of the scientific context [using the scientific method] underlying its performance properties”.

sort_comparison_traceTo place the above statement in context I would suggest getting, reading the book and experimenting with the code. Such approach seems to be the best way to learn, refresh and fully understand any technical or scientific subject. I tend to forget the actual algorithms and their performance. But when needed, like riding a bicycle, all comes back. That is why books and the Internet are so valuable. The issue is that if you have no idea of how sorting (or any other algorithm for that matter) works, it will take considerably longer time to learn and understand and be efficient in their use in actual production code. Be proactive, refresh and learn constantly.

If you have comments or questions on this or any other blog, please send me a message via email.

John

john.canessa@gmail.co

Leave a Reply

Your email address will not be published. Required fields are marked *