Laboratory Exercises For Computer Science 152

Sorting with Divide-and-Conquer Algorithms

Summary: This laboratory exercise introduces and analyzes the quicksort as a further mechanism to order numbers in an array. This algorithm illustrates a powerful approach to problem solving, called divide and conquer.

Quicksort

Initial Notes:


The quicksort is a recursive approach to sorting, and we begin by outlining the principal recursive step. In this step, we guess which value should end up in the middle of the array. In particular, given an array a[0], ..., a[N-1] of data arranged at random, we might guess that the first data item a[0] should end up near the middle of the array once the array is ordered. (a[0] is easy to locate, and it is as good a guess at the median value as any other.) This suggests the following steps:

  1. Rearrange the data in the a array, so that a[0] is moved to its proper position. In other words, move a[0] to a[mid] and rearrange the other elements so that:
    a[0], a[1], ..., a[mid-1] < a[mid]
    and
    a[mid] < a[mid+1], ..., a[N-1].


  2. Repeat this process on the smaller lists
    a[0], a[1], ..., a[mid-1]
    and
    a[mid+1], ..., a[N-1].
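
The goal of step 1 can be made concrete by writing a check for it. The following is a minimal sketch, not part of the lab's code: the class name PartitionCheck and the method isPartitioned are hypothetical names chosen for illustration, and the comparisons use <= and >= instead of the strict < above so that arrays with duplicate values are also accepted.

```java
// Hypothetical helper, not part of the lab's code: checks the goal of step 1.
class PartitionCheck {
    // Returns true if every item before a[mid] is <= a[mid]
    // and every item after a[mid] is >= a[mid].
    static boolean isPartitioned(int[] a, int mid) {
        for (int i = 0; i < mid; i++) {
            if (a[i] > a[mid]) return false;
        }
        for (int i = mid + 1; i < a.length; i++) {
            if (a[i] < a[mid]) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        int[] arranged = {1, 3, 2, 5, 9, 8};   // 5 sits at index 3
        int[] original = {5, 3, 8, 1, 9, 2};   // 5 is still at index 0
        System.out.println(isPartitioned(arranged, 3));  // prints true
        System.out.println(isPartitioned(original, 0));  // prints false
    }
}
```

Note that the two pieces on either side of a[mid] need not be sorted yet; ordering them is the job of step 2.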

A specific example is shown below:


[Figure: Main Steps in a Quicksort]


Outline to Move the First Array Element to the Appropriate Middle

With the above outline, we now consider how to move the first array element into its appropriate location in the array. The basic approach is to work from the ends of the array toward the middle, comparing data elements to the first element and rearranging the array as necessary. The outline details follow:

  1. Compare a[first] to a[last], a[last-1], etc. until an element a[right] is found where a[right] < a[first].

  2. Compare a[first] to a[first+1], a[first+2], etc. until an element a[left] is found where a[left] > a[first].

  3. Swap a[left] and a[right].
    At this point, a[left] < a[first] < a[right], so both swapped elements are on the correct side of the array.

  4. Continue steps 1 through 3, comparing the original first element against elements from both ends of the array, until all elements of the array have been checked.

  5. Swap a[first] with a[right], to put it in its correct location.

These steps are illustrated in the following diagram:


[Figure: Putting the First Array Element in its Place]

Another example may be found in Section 5.5 of the textbook.


The following code implements this basic step:


int left=first+1;
int right=last;
int temp;

while (right >= left) {
    // search left to find small array item
    while ((right >= left) && (a[first] <= a[right]))
        right--;
    // search right to find large array item
    while ((right >= left) && (a[first] >= a[left]))
        left++;
    // swap large left item and small right item, if needed
    if (right > left) {
        temp = a[left];
        a[left] = a[right];
        a[right] = temp;
    }
}
// put a[first] in its place
temp = a[first];
a[first] = a[right];
a[right] = temp;

  1. Review the steps in the above code for the array segments above. Check that the position of the array elements and index variables match the diagram exactly, and explain how the variables are set.

  2. Repeat the above for an array segment in which all elements are already in ascending order. Again, explain what happens.

  3. Next, repeat the steps for the above code for an array segment in which all elements are in descending order. Again, explain what happens.

  4. The code swaps a[left] and a[right] only if right > left. Give an example to show how the code could fail if the swap were done without the test for right > left.

  5. Explain why each loop contains the condition right >= left. Give an example where the code would fail if each right >= left test were omitted.
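
For experimenting with the questions above, it may help to wrap the code fragment in a method and print the array afterward. The sketch below does this under some assumptions: the class name PlaceFirstDemo, the method name placeFirst, and the choice to return the final index of the first element are all introduced here for illustration and do not come from the lab materials.

```java
import java.util.Arrays;

class PlaceFirstDemo {
    // The lab's fragment wrapped in a method (the wrapper is an assumption
    // made for this sketch). Returns the index where a[first] ends up.
    static int placeFirst(int[] a, int first, int last) {
        int left = first + 1;
        int right = last;
        int temp;

        while (right >= left) {
            // move right leftward until it reaches an item smaller than a[first]
            while ((right >= left) && (a[first] <= a[right]))
                right--;
            // move left rightward until it reaches an item larger than a[first]
            while ((right >= left) && (a[first] >= a[left]))
                left++;
            // swap large left item and small right item, if needed
            if (right > left) {
                temp = a[left];
                a[left] = a[right];
                a[right] = temp;
            }
        }
        // put a[first] in its place
        temp = a[first];
        a[first] = a[right];
        a[right] = temp;
        return right;
    }

    public static void main(String[] args) {
        int[] a = {5, 3, 8, 1, 9, 2};
        int mid = placeFirst(a, 0, a.length - 1);
        System.out.println(mid + " " + Arrays.toString(a));
        // prints: 3 [1, 3, 2, 5, 9, 8]
    }
}
```

Replacing the test array with ascending or descending data, or temporarily removing one of the right >= left tests, gives a quick way to compare a hand trace against what the code actually does.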

Given the above code to place the first element of an array segment appropriately and rearrange small and large items, the full array may be sorted by applying the algorithm recursively to the first part and the last part of the array. The base case of the recursion arises if there are no further elements in an array segment to sort.

This gives rise to the following code, called a quicksort.


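public static void quicksort (int [] a) {
// method to sort using the quicksort
    // an array with fewer than 2 items is already sorted;
    // this test also keeps the kernel from reading a[0] of an empty array
    if (a.length > 1)
        quicksortKernel (a, 0, a.length-1);
}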

private static void quicksortKernel (int [] a, int first, int last) {
    int left=first+1;
    int right=last;
    int temp;

    while (right >= left) {
        // search left to find small array item
        while ((right >= left) && (a[first] <= a[right]))
            right--;
        // search right to find large array item
        while ((right >= left) && (a[first] >= a[left]))
            left++;
        // swap large left item and small right item, if needed
        if (right > left) {
            temp = a[left];
            a[left] = a[right];
            a[right] = temp;
        }
    }
    // put a[first] in its place
    temp = a[first];
    a[first] = a[right];
    a[right] = temp;

    // recursively apply algorithm to a[first]..a[right-1] 
    // and a[right+1]..a[last], provided these segments contain >= 2 items
    if (first < right-1)
        quicksortKernel (a, first, right-1);
    if (right+1 < last)
        quicksortKernel (a, right+1, last);   
}

This code illustrates the husk-and-kernel programming style which arose frequently in CSC 151.

  1. Write a few sentences explaining how this code works. What does the husk procedure do? Why is the husk needed?

  2. How is the base case handled in this recursive algorithm?

  3. Program ~walker/152/java-examples/SortShell4.java provides a shell for testing the quicksort. Copy this file to your account, compile it, and run it with several sets of test data.

  4. Modify quicksort and/or quicksortKernel, so that the array is sorted in descending order rather than ascending order. (After testing your revised program, change the code back to its original form for use in the rest of this lab.)
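
SortShell4.java is not reproduced in this handout. If it is unavailable, one way to test the quicksort is to compare its output with the library method java.util.Arrays.sort on several kinds of data. The following is a minimal sketch under that assumption: the class name QuicksortCheck and the method agreesWithLibrary are hypothetical, the seed 152 is arbitrary, and the quicksort is a copy of the lab's code with a guard for arrays of fewer than two items.

```java
import java.util.Arrays;
import java.util.Random;

class QuicksortCheck {
    static void quicksort(int[] a) {
        if (a.length > 1)                  // guard added for this sketch
            quicksortKernel(a, 0, a.length - 1);
    }

    private static void quicksortKernel(int[] a, int first, int last) {
        int left = first + 1;
        int right = last;
        int temp;
        while (right >= left) {
            while ((right >= left) && (a[first] <= a[right]))
                right--;
            while ((right >= left) && (a[first] >= a[left]))
                left++;
            if (right > left) {
                temp = a[left];
                a[left] = a[right];
                a[right] = temp;
            }
        }
        temp = a[first];
        a[first] = a[right];
        a[right] = temp;
        if (first < right - 1)
            quicksortKernel(a, first, right - 1);
        if (right + 1 < last)
            quicksortKernel(a, right + 1, last);
    }

    // Sorts one copy of data with quicksort and another with Arrays.sort,
    // and reports whether the two results are identical.
    static boolean agreesWithLibrary(int[] data) {
        int[] mine = data.clone();
        int[] expected = data.clone();
        quicksort(mine);
        Arrays.sort(expected);
        return Arrays.equals(mine, expected);
    }

    public static void main(String[] args) {
        int[][] tests = {
            {},                               // empty
            {7},                              // single item
            {2, 2, 2, 1, 1},                  // duplicates
            {1, 2, 3, 4, 5},                  // ascending
            {5, 4, 3, 2, 1},                  // descending
            new Random(152).ints(1000, 0, 100).toArray()  // random
        };
        for (int[] t : tests)
            System.out.println(t.length + " items: "
                + (agreesWithLibrary(t) ? "ok" : "MISMATCH"));
    }
}
```

For the descending-order modification in step 4, the same comparison can be reused by reversing the expected array before calling Arrays.equals.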

Analysis and Timing

The quicksort is called a divide-and-conquer algorithm, because the first step normally divides the array into two pieces and the approach is applied recursively to each piece.

Suppose this code is applied to an array containing n randomly ordered data items. For the most part, we might expect that the quicksort's divide-and-conquer strategy will divide the array into two pieces, each of size about n/2, after the first main step. Applying the algorithm to each half, in turn, will divide the array further, into roughly 4 pieces, each of size n/4. Continuing a third time, we would expect to get about 8 pieces, each of size n/8.

Applying this process i times, we would expect to get about 2^i pieces, each of size n/2^i.

This process continues until each array piece has just 1 element in it, so 1 = n/2^i or 2^i = n or i = log2 n. Thus, the total number of main steps for this algorithm should be about log2 n. For each main step, we must examine the various array elements to move the relevant first items of each array segment into their correct places, and this requires us to examine roughly n items.

Altogether, for random data, this suggests that the quicksort requires about log2 n main steps with n operations per main step. Combining these results, quicksort on random data has order O(n log2 n).
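
The halving argument can be checked numerically. The sketch below is only an illustration of the estimate, not a measurement of the quicksort itself: the class name HalvingLevels and the method levels are hypothetical, and integer division stands in for "about n/2".

```java
class HalvingLevels {
    // Number of times n can be halved (integer division) before reaching 1.
    // For n a power of 2 this is exactly log2 n.
    static int levels(int n) {
        int count = 0;
        while (n > 1) {
            n = n / 2;
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        for (int n = 1024; n <= 1024 * 1024; n *= 32) {
            int i = levels(n);
            // rough work estimate: about n items examined per main step
            System.out.println(n + " items: " + i + " main steps, about "
                + ((long) n * i) + " item examinations");
        }
    }
}
```

For 1,024 items this gives 10 main steps, and for 1,048,576 items only 20, which is why the total work grows only slightly faster than n.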

  1. While the quicksort is O(n log2 n) on random data, estimate the amount of work required for data in either ascending or descending order. What is the order of the quicksort in these worst cases?

  2. Suppose the algorithm requires t milliseconds to sort N items. Using the order analysis you just gave, how long would you expect the algorithm to take to sort 2N items? Briefly explain your answer.

  3. Following the approach in the previous lab, insert code to time the quicksort in milliseconds.

  4. Again, following the approach of the previous lab, generate 20,000 numbers in random order, and run the quicksort several times. Record your times.

  5. Next generate 40,000 numbers in random order, and run the quicksort again several times. How do the times for 20,000 and 40,000 compare? Relate your timings to the analysis you did in steps 1 and 2 of this section.

  6. Generate 1,000 numbers in descending order, and run the quicksort several times. Again record your times.

  7. Now generate 2,000 numbers in descending order, run the program several times, and compare the results. Again, relate your results to your predictions using the order analysis in steps 1 and 2 of this section.

  8. Next, predict the time required by the insertion sort to sort 1,000 numbers in descending order. Run your program for this size data set to obtain timings. Briefly explain your results.
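
The previous lab's timing code is not reproduced here. One common approach, offered as an assumption about what that code resembles, records System.currentTimeMillis() before and after the call to the sort. In the sketch below the class name QuicksortTiming and the helpers timeSortMillis, randomData, and descendingData are hypothetical, the seed 152 is arbitrary, and the quicksort is a copy of the lab's code with a guard for arrays of fewer than two items.

```java
import java.util.Random;

class QuicksortTiming {
    static void quicksort(int[] a) {
        if (a.length > 1)                  // guard added for this sketch
            quicksortKernel(a, 0, a.length - 1);
    }

    private static void quicksortKernel(int[] a, int first, int last) {
        int left = first + 1;
        int right = last;
        int temp;
        while (right >= left) {
            while ((right >= left) && (a[first] <= a[right]))
                right--;
            while ((right >= left) && (a[first] >= a[left]))
                left++;
            if (right > left) {
                temp = a[left];
                a[left] = a[right];
                a[right] = temp;
            }
        }
        temp = a[first];
        a[first] = a[right];
        a[right] = temp;
        if (first < right - 1)
            quicksortKernel(a, first, right - 1);
        if (right + 1 < last)
            quicksortKernel(a, right + 1, last);
    }

    // Sorts data in place and returns the elapsed wall-clock time in milliseconds.
    static long timeSortMillis(int[] data) {
        long start = System.currentTimeMillis();
        quicksort(data);
        return System.currentTimeMillis() - start;
    }

    // n pseudo-random values in the range 0..999999
    static int[] randomData(int n, long seed) {
        return new Random(seed).ints(n, 0, 1_000_000).toArray();
    }

    // the values n, n-1, ..., 1
    static int[] descendingData(int n) {
        int[] a = new int[n];
        for (int i = 0; i < n; i++)
            a[i] = n - i;
        return a;
    }

    public static void main(String[] args) {
        for (int n : new int[] {20000, 40000})
            System.out.println("random " + n + ": "
                + timeSortMillis(randomData(n, 152)) + " ms");
        for (int n : new int[] {1000, 2000})
            System.out.println("descending " + n + ": "
                + timeSortMillis(descendingData(n)) + " ms");
    }
}
```

On a fast machine some of these runs may report 0 ms. In that case, System.nanoTime() offers finer resolution, or the sort can be repeated on fresh copies of the data and the total time divided by the number of repetitions.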

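Both the quicksort of this lab and the insertion sort of the previous lab have running times that depend strongly on the initial order of the data. That is, they can be extremely efficient for some types of data, but their efficiency decreases dramatically for some data sets. (This is a different property from the standard notion of a stable sort, which keeps equal items in their original relative order.)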


This document is available on the World Wide Web as

http://www.math.grin.edu/~walker/courses/152.sp01/lab-n-log-n-sorts.html

created March 4, 2001
last revised March 6, 2001