Free Trial

Safari Books Online is a digital library providing on-demand subscription access to thousands of learning resources.

Share this Page URL

Chapter 4. Sorting Algorithms > Criteria for Choosing a Sorting Algorithm - Pg. 99

Note that with 17,576 buckets, H ASH S ORT outperforms Q UICKSORT for n> 8,192 items (and this trend continues with increasing n). However, with only 676 buckets, once n>32,768 (for an average of 48 elements per bucket), H ASH S ORT begins its inevitable slowdown with the accumulated cost of executing I NSER- TION S ORT on increasingly larger sets. Indeed, with only 26 buckets, once n>256, H ASH S ORT begins to quadruple its performance as the problem size doubles, showing how too few buckets leads to O(n 2 ) performance. Criteria for Choosing a Sorting Algorithm To choose a sorting algorithm, consider the qualitative criteria in Table 4-6. These may help your initial decision, but you likely will need more quantitative measures to guide you. Table 4-6. Criteria for choosing a sorting algorithm Criteria Only a few items Items are mostly sorted already Concerned about worst-case scenarios Interested in a good average-case result Items are drawn from a dense universe Desire to write as little code as possible Sorting algorithm I NSERTION S ORT I NSERTION S ORT H EAP S ORT Q UICKSORT B UCKET S ORT I NSERTION S ORT Sorting Algorithms To choose the appropriate algorithm for different data, you need to know some properties about your input data. We created several benchmark data sets on which to show how the algorithms presented in this chapter compare with one another. Note that the actual values of the generated tables are less important because they reflect the specific hardware on which the benchmarks were run. Instead, you should pay attention to the relative performance of the algorithms on the corresponding data sets: Random strings Throughout this chapter, we have demonstrated performance of sorting algo- rithms when sorting 26-character strings that are permutations of the letters in the alphabet. Given there are n! such strings, or roughly 4.03*10 26 strings, there are few duplicate strings in our sample data sets. In addition, the cost of comparing elements is not constant, because of the occasional need to compare multiple characters. Double precision floating-point values Using available pseudorandom generators available on most operating systems, we generate a set of random numbers from the range [0,1). There are essentially no duplicate values in the sample data set and the cost of comparing two elements is a fixed constant. The input data provided to the sorting algorithms can be preprocessed to ensure some of the following properties (not all are compatible): Criteria for Choosing a Sorting Algorithm | 99