Preparation: Reread section 10.2.1 of the textbook, which discusses the insertion sort in the context of both lists and vectors.
Basic Approach: The merge sort algorithm is a particularly effective method for sorting large data sets that are stored in either lists or vectors. Although the algorithm requires some care and attention to detail, the basic approach depends upon a few simple ideas:
Merging Two Ordered Lists: Given two ordered lists, ls-1 and ls-2, a natural recursive algorithm merges them into a new ordered list. The following code follows the merge procedure in Springer and Friedman's book:
(define merge
  (lambda (ls-1 ls-2)
    (cond ((null? ls-1) ls-2)
          ((null? ls-2) ls-1)
          ((< (car ls-1) (car ls-2))
           (cons (car ls-1) (merge (cdr ls-1) ls-2)))
          (else (cons (car ls-2) (merge ls-1 (cdr ls-2))))
    )
  )
)
Use trace-define, rather than define, for this procedure, and trace the algorithm for the two lists (3 4 6) and (2 4 7). For each recursive call to merge, identify the parameters and indicate which clause of the cond is followed. What happens when the first elements of the two lists are equal, and why? Which base case is reached in this example?
(Change the declaration of merge back to define before going to the next part.)
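The same recursive pattern extends to orderings other than < on numbers. The following is only a sketch: merge-by and its less? parameter are hypothetical names introduced here for illustration and do not appear in Springer and Friedman's book.

(define merge-by
  (lambda (less? ls-1 ls-2)
    (cond ((null? ls-1) ls-2)
          ((null? ls-2) ls-1)
          ;; take from ls-1 only when its first element strictly precedes
          ((less? (car ls-1) (car ls-2))
           (cons (car ls-1) (merge-by less? (cdr ls-1) ls-2)))
          (else (cons (car ls-2) (merge-by less? ls-1 (cdr ls-2)))))))

;; both input lists must already be ordered by the predicate supplied
(merge-by > '(9 5 1) '(8 2))                        ; => (9 8 5 2 1)
(merge-by string<? '("ant" "cow") '("bee" "dog"))   ; => ("ant" "bee" "cow" "dog")

Passing < as less? gives the same behavior as merge above.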
( (1 3 6) (2 7) (2 3 8 9) (0 5 9) (1 4 6 8) (0 4 5) )
We can get a smaller number of longer ordered lists by merging the first two lists, then the next two lists, and so on, until all the small ordered lists have been processed. This is done in the pair-merge procedure, which is program 10.8 in Springer and Friedman's book.
(define pair-merge
  (lambda (sublists)
    (cond ((null? sublists) '())
          ((null? (cdr sublists)) sublists)
          (else (cons (merge (car sublists) (cadr sublists))
                      (pair-merge (cddr sublists))))
    )
  )
)
(pair-merge '( (1 3 6) (2 7) (2 3 8 9) (0 5 9) (1 4 6 8) ))
How is the last small list (e.g., (1 4 6 8)) handled? Explain.
( 1 3 6 2 7 2 3 8 9 0 5 9 1 4 6 8 0 4 5)
this process yields
( (1 3 6) (2 7) (2 3 8 9) (0 5 9) (1 4 6 8) (0 4 5) )
Springer and Friedman's make-groups procedure (program 10.7) follows this approach:
(define make-groups
  (lambda (ls)
    (cond ((null? ls) '())
          ((null? (cdr ls)) (list ls))
          (else (let ((a (car ls))
                      (gps (make-groups (cdr ls))))
                  (if (< (cadr ls) a)
                      (cons (list a) gps)
                      (cons (cons a (car gps)) (cdr gps))
                  )
                )
          )
    )
  )
)
(define nat-mergesort
  (lambda (ls)
    (if (null? ls)
        '()
        (let sort ((gps (make-groups ls)))
          (if (null? (cdr gps))
              (car gps)
              (sort (pair-merge gps))
          )
        )
    )
  )
)
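To watch nat-mergesort at work, it can help to print the list of groups before each round of pair-merge. The sketch below assumes that merge, pair-merge, and make-groups are already defined as above; show-passes is a hypothetical helper name, not part of the textbook's code.

(define show-passes
  (lambda (ls)
    (let loop ((gps (make-groups ls)))
      (display gps)                     ; current list of ordered groups
      (newline)
      (cond ((null? gps) '())           ; empty input: nothing to sort
            ((null? (cdr gps)) (car gps)) ; one group left: it is the result
            (else (loop (pair-merge gps)))))))

(show-passes '(3 1 2))
;; prints ((3) (1 2)) and then ((1 2 3)), and returns (1 2 3)

Each printed line should contain roughly half as many groups as the one before it, which is why only a few rounds are needed even for long lists.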
The corresponding code follows the same approach as for lists, with two major changes:
(define vector-merge!
  (lambda (newvec vec left group-size vec-size)
    (let* ((top-left (min vec-size (+ left group-size)))
           (right top-left)
           (top-right (min vec-size (+ right group-size))))
      (let mergeloop ((left left) (right right) (i left))
        (cond ((and (< left top-left) (< right top-right))
               (if (< (vector-ref vec left) (vector-ref vec right))
                   (begin
                     (vector-set! newvec i (vector-ref vec left))
                     (mergeloop (add1 left) right (add1 i)))
                   (begin
                     (vector-set! newvec i (vector-ref vec right))
                     (mergeloop left (add1 right) (add1 i)))))
              ((< left top-left)
               (vector-set! newvec i (vector-ref vec left))
               (mergeloop (add1 left) right (add1 i)))
              ((< right top-right)
               (vector-set! newvec i (vector-ref vec right))
               (mergeloop left (add1 right) (add1 i))))))))
Also, since merging involves copying from one vector to another, we begin by creating a second vector. Then we repeatedly merge from one vector to the other. Some care is needed to keep track of which vector has the current data set. This is accomplished with a count variable which is even or odd according to which vector has the current ordered groups.
(define vector-mergesort!
  (lambda (orig-vec)
    (let* ((vec-size (vector-length orig-vec))
           (new-vec (make-vector vec-size)))
      ;; merge with successively larger group sizes
      (do ((group-size 1 (* group-size 2))   ;; loop variables
           (twice-size 2 (* twice-size 2))
           (count 1 (add1 count))
           (vec1 orig-vec vec2)
           (vec2 new-vec vec1))
          ((>= group-size vec-size)          ;;; exit condition
           (if (even? count)                 ;;; copy to orig-vec, if needed
               (do ((i 0 (add1 i)))          ;;; this do replaces
                   ((>= i vec-size))         ;;; vector-change!
                 (vector-set! orig-vec i (vector-ref new-vec i)))))
        ;; successively merge next two groups
        (do ((left 0 (+ left twice-size)))   ;; loop variables
            ((>= left vec-size))             ;; exit when array processed
          (vector-merge! vec2 vec1 left group-size vec-size))))))
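The even/odd bookkeeping with count can be checked in isolation. The following sketch repeats only the loop arithmetic of vector-mergesort! (no data is moved) and reports which vector would hold the sorted data when the loop exits. The name final-holder and the symbols it returns are hypothetical, chosen here for illustration.

(define final-holder
  (lambda (vec-size)
    ;; group-size and count change exactly as in vector-mergesort!
    (let loop ((group-size 1) (count 1))
      (if (>= group-size vec-size)
          ;; an even count means the last merge wrote into new-vec,
          ;; so a final copy back to orig-vec is needed
          (if (even? count) 'new-vec 'orig-vec)
          (loop (* group-size 2) (+ count 1))))))

(final-holder 10)   ; => orig-vec  (four passes: group sizes 1, 2, 4, 8)
(final-holder 2)    ; => new-vec   (one pass, so a copy back is required)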
(display count) (newline)
(display left) (newline)
(display vec1) (newline)
(display vec2) (newline)
[Thus, these lines come before the last five right parentheses in the procedure.]
Now run the procedure with the following lines:
(define data #(3 1 4 1 5 9 2 6 5 3))
data
(vector-mergesort! data)
data
The added display lines allow us to trace the updating of data in each vector immediately after each call to vector-merge!. Print out the results of this test run, and explain what processing has occurred at each step.
This document is available on the World Wide Web as
http://www.math.grin.edu/~walker/courses/151/lab-merge-sort.html