Laboratory Exercises For Computer Science 151

Merge Sort

Goals: This laboratory exercise explores the merge sort algorithm.

Preparation: Reread section 10.2.1 of the textbook, which discusses the insertion sort in the context of both lists and vectors.

Basic Approach: The merge sort algorithm is a particularly effective method for sorting large data sets which are stored either in lists or in vectors. While the algorithm requires some care and attention to detail, the basic approach depends upon a few simple ideas:

  1. The merging of two ordered lists to get a composite ordered list is both straightforward and efficient.

  2. If one starts with a collection of small ordered lists, then one can repeatedly merge the small lists into bigger ones until all items are merged into one final ordered list.

  3. Any list of elements can be divided into small pieces (with one or more elements per piece), where each piece is ordered.

The first part of this lab examines each of these ideas in some detail in the context of general lists of numbers. The second part of this lab applies the same ideas to numbers stored within a vector.

Merging Two Ordered Lists: Given two ordered lists, ls-1 and ls-2, a natural recursive algorithm merges them into a new ordered list. The following code follows the merge procedure in Springer and Friedman's book:


(define merge
   (lambda (ls-1 ls-2)
       (cond ((null? ls-1) ls-2)
             ((null? ls-2) ls-1)
             ((< (car ls-1) (car ls-2))
                    (cons (car ls-1) (merge (cdr ls-1) ls-2)))
             (else  (cons (car ls-2) (merge ls-1 (cdr ls-2))))
       )
   )
)
  1. Describe in words the base cases for this recursion.

  2. One might describe the recursive case for this algorithm as follows: The first numbers of each list are compared and the smaller number is identified. This smaller number is then placed at the beginning of the result obtained when the rest of that list is merged with the other list.

    Use trace-define, rather than define, for this procedure, and trace the algorithm for the two lists (3 4 6) and (2 4 7). For each recursive call to merge, identify the parameters, and indicate which condition within cond is followed. What happens when the first elements on the two lists are equal? (Why?) What base case is used for this example?

    (Change the declaration of merge back to define before going to the next part.)
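As a quick check of the procedure above, the following sketch repeats the definition of merge so that it runs on its own, and applies it to two small ordered lists that are not part of the exercises. The expected results in the comments were worked out by hand. Some Scheme systems may already provide a procedure named merge with different arguments; the sketch assumes this definition replaces it.

```scheme
;; merge as given in this lab, repeated so the example is self-contained
(define merge
  (lambda (ls-1 ls-2)
    (cond ((null? ls-1) ls-2)
          ((null? ls-2) ls-1)
          ((< (car ls-1) (car ls-2))
           (cons (car ls-1) (merge (cdr ls-1) ls-2)))
          (else (cons (car ls-2) (merge ls-1 (cdr ls-2)))))))

;; two ordered lists with no elements in common
(merge '(1 5 9) '(2 3 10))   ; => (1 2 3 5 9 10)

;; one empty list: the other list is returned unchanged
(merge '() '(4 8))           ; => (4 8)
```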

Merge Many Small Lists Into Bigger Lists: Now suppose we have a list of many small, ordered lists, such as

( (1 3 6) (2 7) (2 3 8 9) (0 5 9) (1 4 6 8) (0 4 5) )

We can get a smaller number of longer ordered lists by merging the first two lists, then the next two lists, and so on, until all the small ordered lists have been processed. This is done in the pair-merge procedure, which is program 10.8 in Springer and Friedman's book.


(define pair-merge
   (lambda (sublists)
       (cond ((null? sublists) '())
             ((null? (cdr sublists)) sublists)
             (else (cons (merge (car sublists) (cadr sublists))
                         (pair-merge (cddr sublists))))
       )
   )
)
  1. Test this procedure on the above list of small, ordered lists. What is the result?

  2. The previous test involved an even number of small ordered lists. What happens if the last of those small lists is omitted? That is, describe the result of running
    
    (pair-merge '( (1 3 6) (2 7) (2 3 8 9) (0 5 9) (1 4 6 8) ))
    
    How is the last small list (e.g., (1 4 6 8)) handled? Explain.

  3. Write three or four sentences explaining the base cases and the recursive cases in this recursive procedure.
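The idea of repeatedly merging pairs can be seen by applying pair-merge to its own result. The sketch below repeats merge and pair-merge from this lab so it runs on its own; the sample data and the names groups and pass-1 are made up for illustration.

```scheme
;; definitions from this lab, repeated so the example is self-contained
(define merge
  (lambda (ls-1 ls-2)
    (cond ((null? ls-1) ls-2)
          ((null? ls-2) ls-1)
          ((< (car ls-1) (car ls-2))
           (cons (car ls-1) (merge (cdr ls-1) ls-2)))
          (else (cons (car ls-2) (merge ls-1 (cdr ls-2)))))))

(define pair-merge
  (lambda (sublists)
    (cond ((null? sublists) '())
          ((null? (cdr sublists)) sublists)
          (else (cons (merge (car sublists) (cadr sublists))
                      (pair-merge (cddr sublists)))))))

;; hypothetical sample data: four small ordered lists
(define groups '((1 4) (2 3) (0 9) (5 6 7)))

;; first pass: four lists become two
(define pass-1 (pair-merge groups))   ; => ((1 2 3 4) (0 5 6 7 9))

;; second pass: two lists become one
(pair-merge pass-1)                   ; => ((0 1 2 3 4 5 6 7 9))
```

Each pass roughly halves the number of sublists, so only a few passes are needed even for a long list of groups.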

Generating Small, Ordered Lists: Any list of numbers can be divided into small, ordered lists: start with the first number and continue placing elements on a small list as long as the numbers do not decrease; start a second small list once the first run is completed; etc. For example, given an initial list

(1 3 6 2 7 2 3 8 9 0 5 9 1 4 6 8 0 4 5)

this process yields

( (1 3 6) (2 7) (2 3 8 9) (0 5 9) (1 4 6 8) (0 4 5) )

Springer and Friedman's make-groups procedure (program 10.7) follows this approach:

(define make-groups
   (lambda (ls)
       (cond ((null? ls) '())
             ((null? (cdr ls)) (list ls))
             (else (let ((a (car ls))
                         (gps (make-groups (cdr ls))))
                      (if (< (cadr ls) a)
                          (cons (list a) gps)
                          (cons (cons a (car gps)) (cdr gps))
                      )
                )
           )
       )
   )
)
  1. Run make-groups on the above list to check it produces the list of small lists as claimed.

  2. Describe each base case in this recursion.

  3. The recursive call in this procedure occurs as part of the bindings within the let statement. Describe in detail how the if statement puts the appropriate pieces together in the recursive case.
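
The grouping rule can be tried on two further lists. This sketch repeats make-groups from this lab so it runs on its own; the sample lists are made up for illustration. Since the test in the procedure uses a strict <, equal neighbors stay in the same group, and a strictly decreasing list splits into groups of one element each.

```scheme
;; make-groups as given in this lab, repeated so the example is self-contained
(define make-groups
  (lambda (ls)
    (cond ((null? ls) '())
          ((null? (cdr ls)) (list ls))
          (else (let ((a (car ls))
                      (gps (make-groups (cdr ls))))
                  (if (< (cadr ls) a)
                      (cons (list a) gps)
                      (cons (cons a (car gps)) (cdr gps))))))))

;; the repeated 7 stays in the first group
(make-groups '(4 7 7 1 9 3))   ; => ((4 7 7) (1 9) (3))

;; strictly decreasing input: every element is its own group
(make-groups '(3 2 1))         ; => ((3) (2) (1))
```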

Merge Sort For Lists: The full merge sort algorithm for lists of numbers puts the above pieces together. The initial list is divided into a list of small ordered lists. Then these small lists are repeatedly merged until only one large, ordered list remains. The following code rewrites program 10.9 in Springer and Friedman's text by replacing the letrec with a named let construction.

(define nat-mergesort
   (lambda (ls)
       (if (null? ls)
           '()
           (let sort ((gps (make-groups ls)))
                (if (null? (cdr gps))
                    (car gps)
                    (sort (pair-merge gps))
                )
           )
       )
   )
)
  1. What can go wrong if the initial test for a null list is omitted?

  2. Explain how this code merges successive pairs to obtain a completely ordered list.

  3. Why does this procedure return (car gps) rather than gps in the base case of the recursion?

  4. Rewrite this code by replacing the let statement by a do expression.
    Hint: The resulting code can be much simpler; the body of the do expression can be null.
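
To watch the groups shrink from pass to pass, one can record the list of groups before each call to pair-merge. The following is a minimal sketch, not part of Springer and Friedman's code: the procedure name merge-passes and the sample data are hypothetical, and the three procedures from this lab are repeated so the example runs on its own.

```scheme
;; definitions from this lab, repeated so the example is self-contained
(define merge
  (lambda (ls-1 ls-2)
    (cond ((null? ls-1) ls-2)
          ((null? ls-2) ls-1)
          ((< (car ls-1) (car ls-2))
           (cons (car ls-1) (merge (cdr ls-1) ls-2)))
          (else (cons (car ls-2) (merge ls-1 (cdr ls-2)))))))

(define pair-merge
  (lambda (sublists)
    (cond ((null? sublists) '())
          ((null? (cdr sublists)) sublists)
          (else (cons (merge (car sublists) (cadr sublists))
                      (pair-merge (cddr sublists)))))))

(define make-groups
  (lambda (ls)
    (cond ((null? ls) '())
          ((null? (cdr ls)) (list ls))
          (else (let ((a (car ls))
                      (gps (make-groups (cdr ls))))
                  (if (< (cadr ls) a)
                      (cons (list a) gps)
                      (cons (cons a (car gps)) (cdr gps))))))))

;; hypothetical helper: returns the list of groups at every stage,
;; from the initial grouping to the final single group
(define merge-passes
  (lambda (ls)
    (if (null? ls)
        '()
        (let loop ((gps (make-groups ls)) (history '()))
          (if (null? (cdr gps))
              (reverse (cons gps history))
              (loop (pair-merge gps) (cons gps history)))))))

(merge-passes '(5 2 8 1 9 3 7))
;; => (((5) (2 8) (1 9) (3 7))
;;     ((2 5 8) (1 3 7 9))
;;     ((1 2 3 5 7 8 9)))
```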

Merge Sort For Vectors: The merge sort for vectors follows a similar approach, with a few technical changes:
  1. At the start, we consider each element to be an ordered list of length 1. This eliminates the need for the make-groups procedure.

  2. Since all initial ordered lists have length 1, the first merge will produce lists of length 2, the next merged lists will have length 4, etc. A variable group-size can be used to keep track of this uniform length.

  3. When the group-size is known, we can determine where each small ordered list begins and ends by simple arithmetic.

  4. When small ordered lists are located in one vector, the merging process could overwrite some of these values if the results were placed in the same vector. Thus, in merging groups of vector elements, we take elements from one vector and store the result in a second vector.

  5. Since the initial vector may be of any length, the last group may not contain exactly group-size elements. We must always be careful not to access a position outside the vector.
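
The arithmetic in points 3 and 5 can be sketched as follows. The procedure name group-bounds is hypothetical; for each pair of adjacent groups it lists where the first group starts (left), where the first group ends and the second begins (top-left), and where the second group ends (top-right), clipping each bound to the size of the vector.

```scheme
;; hypothetical helper: list (left top-left top-right) for each pair of
;; adjacent groups, given the current group size and the vector size
(define group-bounds
  (lambda (group-size vec-size)
    (let loop ((left 0) (acc '()))
      (if (>= left vec-size)
          (reverse acc)
          (let* ((top-left (min vec-size (+ left group-size)))
                 (top-right (min vec-size (+ top-left group-size))))
            (loop (+ left (* 2 group-size))
                  (cons (list left top-left top-right) acc)))))))

;; a vector of 10 elements, groups of size 2: the last pair has an
;; empty second group, since positions 10 and 11 do not exist
(group-bounds 2 10)   ; => ((0 2 4) (4 6 8) (8 10 10))

;; groups of size 4: the second pair holds only positions 8 and 9
(group-bounds 4 10)   ; => ((0 4 8) (8 10 10))
```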

Procedure vector-merge! merges short groups of data within vectors just as merge combines two groups for lists. To be more precise, vector-merge! processes two groups of data within the vector vec, beginning at position left. Parameter group-size indicates the size of each piece, and vec-size gives the size of the vector. The two groups are merged into the corresponding locations within vector newvec. Schematically, this is shown in the following diagram.

[Diagram: two adjacent groups of vec, beginning at position left, are merged into the corresponding positions of newvec.]

The corresponding code follows the same approach as for lists, with two major changes: The following version is based loosely on program 10.10 of Springer and Friedman.

(define vector-merge!
   (lambda (newvec vec left group-size vec-size)
      (let* ((top-left (min vec-size (+ left group-size)))
             (right top-left)
             (top-right (min vec-size (+ right group-size))))
         (let mergeloop ((left left) (right right) (i left))
              (cond ((and (< left top-left) (< right top-right))
                        (if (< (vector-ref vec left) (vector-ref vec right))
                           (begin
                              (vector-set! newvec i (vector-ref vec left)) 
                              (mergeloop (add1 left) right (add1 i)))
                           (begin
                              (vector-set! newvec i (vector-ref vec right)) 
                              (mergeloop left (add1 right) (add1 i)))))
                    ((< left top-left)
                        (vector-set! newvec i (vector-ref vec left))
                        (mergeloop (add1 left) right (add1 i)))
                    ((< right top-right)
                        (vector-set! newvec i (vector-ref vec right))
                        (mergeloop left (add1 right) (add1 i))))))))
  1. Add comments to the above code to explain how two sections of one vector are merged into the second vector.

  2. State in words each of the cases which must be considered. Then illustrate each of these cases on an appropriate diagram, based on the one above.
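
A small test run may help with these exercises. The sketch below repeats vector-merge! from this lab and defines add1 in case the Scheme system in use does not supply it; the vectors vec and newvec and their contents are made up for illustration. With a group size of 2, the first call merges the groups at positions 0-1 and 2-3, and the second call handles the leftover single element at position 4.

```scheme
;; add1 is built into some Scheme systems; defined here for portability
(define add1 (lambda (n) (+ n 1)))

;; vector-merge! as given in this lab, repeated so the example is self-contained
(define vector-merge!
  (lambda (newvec vec left group-size vec-size)
    (let* ((top-left (min vec-size (+ left group-size)))
           (right top-left)
           (top-right (min vec-size (+ right group-size))))
      (let mergeloop ((left left) (right right) (i left))
        (cond ((and (< left top-left) (< right top-right))
               (if (< (vector-ref vec left) (vector-ref vec right))
                   (begin
                     (vector-set! newvec i (vector-ref vec left))
                     (mergeloop (add1 left) right (add1 i)))
                   (begin
                     (vector-set! newvec i (vector-ref vec right))
                     (mergeloop left (add1 right) (add1 i)))))
              ((< left top-left)
               (vector-set! newvec i (vector-ref vec left))
               (mergeloop (add1 left) right (add1 i)))
              ((< right top-right)
               (vector-set! newvec i (vector-ref vec right))
               (mergeloop left (add1 right) (add1 i))))))))

;; hypothetical data: ordered groups (3 8), (1 5), and a short last group (7)
(define vec (vector 3 8 1 5 7))
(define newvec (make-vector 5 0))

(vector-merge! newvec vec 0 2 5)   ; merges positions 0-3
(vector-merge! newvec vec 4 2 5)   ; copies the leftover element at position 4
newvec                             ; => #(1 3 5 8 7)
```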

Turning to the full merge sort algorithm for vectors, the following procedure combines pair-merge and nat-mergesort, with one do expression for each. For example, the inner do performs the role of pair-merge. The variable left indicates where the first of two groups starts, and the inner do moves from group to group from the start of the vector to the end.

Also, since merging involves copying from one vector to another, we begin by creating a second vector. Then we repeatedly merge from one vector to the other. Some care is needed to keep track of which vector has the current data set. This is accomplished with a count variable which is even or odd according to which vector has the current ordered groups.


(define vector-mergesort!
   (lambda (orig-vec)
      (let* ((vec-size (vector-length orig-vec))
             (new-vec (make-vector vec-size)))
        ;; merge with successively larger group sizes
        (do ((group-size 1 (* group-size 2))    ;; loop variables
             (twice-size 2 (* twice-size 2))
             (count 1 (add1 count))
             (vec1 orig-vec vec2)
             (vec2 new-vec vec1))
            ((>= group-size vec-size)          ;;; exit condition
                (if (even? count)              ;;; copy to orig-vec, if needed
                        (do ((i 0 (add1 i)))   ;;; this do replaces 
                            ((>= i vec-size))  ;;; vector-change!
                            (vector-set! orig-vec i (vector-ref new-vec i)))))
            ;; successively merge next two groups
            (do ((left 0 (+ left twice-size)))    ;; loop variables
                ((>= left vec-size))              ;; exit when array processed
                (vector-merge! vec2 vec1 left group-size vec-size))))))
  1. The above commentary outlines the purpose of the inner do expression. Write a similar commentary for the outer do expression.

  2. Add the following lines immediately after the call to vector-merge!, still within the inner do.
    
    (display count) (newline)
    (display left) (newline)
    (display vec1) (newline)
    (display vec2) (newline)
    
    [Thus, these lines come before the last five right parentheses in the procedure.]

    Now run the procedure with the following lines:

    
    (define data (vector 3 1 4 1 5 9 2 6 5 3))   ; built with vector so it can be modified
    data
    (vector-mergesort! data)
    data
    
    The added lines allow us to trace the updating of data in each vector immediately after each call to vector-merge!. Print out the results of this test run, and explain what processing has occurred at each step.
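
The bookkeeping done by the loop variables group-size and count can be examined apart from the sorting itself. The following is a minimal sketch under that assumption; the procedure name pass-plan is hypothetical. It steps group-size and count exactly as vector-mergesort! does, and reports each pass as (count group-size), together with whether count is even at exit, which is the condition under which the sorted data must be copied back into the original vector.

```scheme
;; hypothetical helper: list the passes made for a vector of the given
;; size, followed by #t if a final copy back to the original is needed
(define pass-plan
  (lambda (vec-size)
    (let loop ((group-size 1) (count 1) (acc '()))
      (if (>= group-size vec-size)
          (list (reverse acc) (even? count))
          (loop (* group-size 2)
                (+ count 1)
                (cons (list count group-size) acc))))))

;; 10 elements: four passes, ending back in the original vector
(pass-plan 10)   ; => (((1 1) (2 2) (3 4) (4 8)) #f)

;; 5 elements: three passes, so the result must be copied back
(pass-plan 5)    ; => (((1 1) (2 2) (3 4)) #t)
```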

This document is available on the World Wide Web as

http://www.math.grin.edu/~walker/courses/151/lab-merge-sort.html

created April 16, 1997
last revised April 25, 1997