The insertion sort algorithm is not always the best one to use. When sorting n values, starting from a random initial arrangement, the insertion sort has to look through half of the elements in the sorted part of the data structure to find the correct insertion point for each new value it places. The size of that sorted part increases linearly from 0 to n, so its average size is n/2 and the average number of comparisons needed to insert one element is n/4. Taking all the insertions together, then, the insertion sort performs about n2/4 comparisons to sort the entire set.
Accordingly, when the number of values to be sorted is large (greater than one hundred, say), it is preferable to use a sorting method that is more complicated to set up initially but performs fewer operations on each element in the process of positioning it correctly. The merge sort algorithm is often an appropriate choice. We shall examine two variants of this algorithm -- one taking a list of values and returning a newly allocated list containing the same values, but in sorted order, and the other one taking a vector of values and arranging its elements as a ``destructive'' side effect.
The fundamental building block for the first version of the merge sort is
the merge procedure, which takes two sorted lists as arguments
and returns a sorted list containing all of the values from both argument
lists. Program 4.3 on page 98 of our textbook is an implementation of this
procedure. Since the lists to be merged will sometimes be quite large, I
prefer the following tail-recursive implementation:
(define merge
(lambda (ls-1 ls-2)
(let loop ((rest-1 ls-1)
(rest-2 ls-2)
(merged '()))
(cond ((null? rest-1) (revappend merged rest-2))
((null? rest-2) (revappend merged rest-1))
(else (let ((first-1 (car rest-1))
(first-2 (car rest-2)))
(if (< first-2 first-1)
(loop rest-1 (cdr rest-2) (cons first-2 merged))
(loop (cdr rest-1) rest-2 (cons first-1 merged)))))))))
(define revappend
(lambda (ls-1 ls-2)
(if (null? ls-1)
ls-2
(revappend (cdr ls-1) (cons (car ls-1) ls-2)))))
The idea is to build up the result list merged by repeatedly
taking either the next available value from ls-1 or the next
available value from ls-2, whichever comes first, and
attaching it to merged. In the recursive call, whichever value
was thus selected and transferred is eliminated from further consideration.
The process stops as soon as one of the original lists is exhausted; at
that point, the remaining values in the other list (if any) are appended to
the merged list. (This is done with revappend
rather than append, because merged is built up in
the wrong order during the tail recursion -- the first value recovered from
ls-1 or ls-2 becomes the last element of
merged, and so on.)
Use merge to merge the lists (2 3 4 7 8 10 12)
and (1 6 11 13 14).
What happens if merge is applied to lists that are not already
in ascending order?
What happens if merge is given two empty lists as arguments?
The merge sort works by starting with short lists, merging them to form
somewhat longer lists, merging the somewhat longer lists to form still
longer ones, and so on until only one list remains -- the result of the
final merge operation. The short lists that we begin with must satisfy the
precondition for the merge procedure: they must already be
sorted. The easiest way to ensure this without having to perform any
comparisons is to put just one value into each short list; a single value
is already ``sorted,'' in a trivial or vacuous sense (and the
merge procedure will work correctly on such single-element
lists).
We'll keep all of the lists that will eventually be merged together in one
master list of lists. If ls is the list to be sorted, then,
the master list will initially be (map list ls) -- a list of
one-element lists, each containing one element of ls.
The following procedure, pair-merge, then traverses this
master list, taking its lists two at a time, merging each pair and adding
the merged pair to a new master list, which it returns. (If the master
list contains an odd number of lists, pair-merge leaves the
last one alone.)
(define pair-merge
(lambda (master-list)
(let loop ((sublists master-list)
(new-master-list '()))
(cond ((null? sublists) new-master-list)
((null? (cdr sublists)) (cons (car sublists) new-master-list))
(else (loop (cddr sublists)
(cons (merge (car sublists) (cadr sublists))
new-master-list)))))))
The full merge-sort procedure simply calls
pair-merge repeatedly until all of the lists have been
combined into one. (The empty list must be handled as a special case,
because it is impossible to ``combine'' zero lists into one.)
(define merge-sort
(lambda (ls)
(if (null? ls)
'()
(do ((master-list (map list ls) (pair-merge master-list)))
((null? (cdr master-list)) (car master-list))))))
Like the insertion sort, the merge sort
algorithm can be made more generally applicable by making the ordering rule
a parameter of the (curried) procedure, so the final implementation of this
first version of merge sort, with the merge and
pair-merge procedures embedded, looks like this:
(define merge-sort
(lambda (precedes?)
(define merge
(lambda (ls-1 ls-2)
(let loop ((rest-1 ls-1)
(rest-2 ls-2)
(merged '()))
(cond ((null? rest-1) (revappend merged rest-2))
((null? rest-2) (revappend merged rest-1))
(else (let ((first-1 (car rest-1))
(first-2 (car rest-2)))
(if (precedes? first-2 first-1)
(loop rest-1 (cdr rest-2)
(cons first-2 merged))
(loop (cdr rest-1) rest-2
(cons first-1 merged)))))))))
(define pair-merge
(lambda (master-list)
(let loop ((sublists master-list)
(new-master-list '()))
(cond ((null? sublists) new-master-list)
((null? (cdr sublists))
(cons (car sublists) new-master-list))
(else (loop (cddr sublists)
(cons (merge (car sublists) (cadr sublists))
new-master-list)))))))
(lambda (ls)
(if (null? ls)
'()
(do ((master-list (map list ls) (pair-merge master-list)))
((null? (cdr master-list)) (car master-list)))))))
Again, I have left revappend as a separate, non-embedded
procedure because it is often useful even outside the
merge-sort procedure. The one change in the code for the
embedded procedures is that the call to the numerical comparison procedure
< has been replaced with a call to precedes?.
Suppose, now, that the values to be sorted arrive in the form of a vector, and that the objective is to revise the contents of the vector so that at the end of the sorting procedure the same elements are present but arranged in the order specified by the comparison rule. As in the constructive version of the algorithm, we want to work our way up from single-element subvectors. Instead of allocating a separate vector for each single element, however, we can take advantage of constant-time access to the elements of a vector by making the separation purely notional: We can identify a sub-vector of the original vector by keeping track of the positions at which it begins and ends.
I'll adopt the convention that the starting position of a
subvector is the position of the first element that is inside the
subvector, and the ending position is the position of the first
element after and outside of the subvector (or, if there is no
such element, the length of the entire vector). So, within the vector
#(a b c d e), the subvector with elements b and
c has starting position 1 and ending position 3. (This
convention has the advantage that the number of elements in the subvector
is the difference between the ending position and the starting position.
The arguments to Scheme's substring procedure are required to
follow the same convention, for the same reason.)
The merge! procedure takes two adjacent subvectors
of the same vector, both of which must already be sorted, and overwrites
them with the merged (and therefore sorted) version of their elements.
Unfortunately, this cannot be done completely ``in place''; there must be
a ``holding area'' that provides separate storage for the elements as they
are merged, and at the end of the merging process the elements have to be
copied back from the holding area into the original vector. In this
implementation, the holding area takes the form of a second vector, of the
same size as the original. As the two adjacent subvectors of
the original vector are merged, they are placed into the positions of the
holding vector that they will eventually occupy in the original vector;
at the end of the merge, they are copied.
Define a Scheme procedure copy-subvector! that takes four
arguments -- a source vector source, the starting position
start and ending position end of a subvector of
source, and a target vector target of the same
size as source -- and copies the specified subvector of
source into the corresponding positions in
target, using vector-set!.
> (define vector-1 (vector 'alpha 'beta 'gamma 'delta 'epsilon)) > (define vector-2 (vector 'first 'second 'third 'fourth 'fifth)) > (copy-subvector! vector-1 1 3 vector-2) > vector-2 #(first beta gamma fourth fifth) > vector-1 #(alpha beta gamma delta epsilon)
The arguments to the merge! procedure are the starting
positions of the two subvectors and the ending position of the second one.
(Since they are supposed to be adjacent, the ending position of the first is
assumed to be identical with the starting position of the second.)
(define merge!
(lambda (start-1 start-2 end-2)
(let loop ((target start-1)
(current-1 start-1)
(current-2 start-2))
(cond ((= current-1 start-2)
(copy-subvector! holding start-1 current-2 vec))
((or (= current-2 end-2)
(precedes? (vector-ref vec current-1)
(vector-ref vec current-2)))
(vector-set! holding target (vector-ref vec current-1))
(loop (+ target 1) (+ current-1 1) current-2))
(else
(vector-set! holding target (vector-ref vec current-2))
(loop (+ target 1) current-1 (+ current-2 1)))))))
The variable target counts off the positions in the holding
vector as they are filled up, from left to right. Current-1
keeps track of the position of the leftmost element in the first subvector
that has not yet been copied into the holding vector;
current-2 does the same for the second subvector. There are
three cases, corresponding to the three cond-clauses:
If current-1 has been incremented enough times to make it
equal to the ending position of the first subvector (which is
start-2), then the recursion can stop and all the elements
that have been moved to the holding vector can be copied back into
vec. (The remaining elements in the second subvector are
already in their correct sorted positions and need not be moved at all.)
If no more elements remain in the second subvector (because
current-2 has been incremented until it is equal to
end-2, or if the current element from the first subvector
should precede the current element of the second subvector, copy the
current element from the first subvector into the holding area, then
advance to the next position in the first subvector and in the holding
area.
Otherwise, copy the current element from the second subvector into the holding area, then advance to the next position in the second subvector and in the holding area.
In this definition of merge!, the identifiers
vec, holding, and precedes? are not
bound. To make it work, one must embed it in a context in which all of
these identifiers have been bound. We'll do this below.
The pair-merge! procedure walks through a vector, applying the
merge! procedure to pairs of adjacent subvectors of a
specified size. (Initially, the size is 1, just as the first call to
pair-merge in the constructive version of the merge sort
applied to single-element lists. The size will double on each successive
call.)
Unless the length of the vector is an exact power of 2, the
pair-merge! procedure will occasionally wind up with an
unpaired subvector at the end (which is simply left to be merged in on a
subsequent pass) or with two subvectors of unequal length. In the latter
case, the third argument to merge! is fixed at the length of
the entire vector vec to make sure that the merge terminates
correctly. Once again, there is an unbound identifier (len)
in the body of this definition, so that it should be embedded in a context
in which that identifier has been bound.
(define pair-merge!
(lambda (subvector-size)
(let loop ((start-1 0))
(if (< (+ start-1 subvector-size) len)
(let* ((start-2 (+ start-1 subvector-size))
(end-2 (min len (+ start-2 subvector-size))))
(merge! start-1 start-2 end-2)
(loop end-2))))))
Finally, the merge-sort! procedure invokes
pair-merge! repeatedly until it has merged all the subvectors
together:
(define merge-sort!
(lambda (precedes?)
(lambda (vec)
(let* ((len (vector-length vec))
(holding (make-vector len)))
;; Embedded definitions go here.
(do ((subvector-size 1 (* 2 subvector-size)))
((<= len subvector-size))
(pair-merge! subvector-size))))))
Assembling all the pieces yields the full algorithm:
(define merge-sort!
(lambda (precedes?)
(lambda (vec)
(let* ((len (vector-length vec))
(holding (make-vector len)))
(define merge!
(lambda (start-1 start-2 end-2)
(let loop ((target start-1)
(current-1 start-1)
(current-2 start-2))
(cond ((= current-1 start-2)
(copy-subvector! holding start-1 current-2 vec))
((or (= current-2 end-2)
(precedes? (vector-ref vec current-1)
(vector-ref vec current-2)))
(vector-set! holding target (vector-ref vec current-1))
(loop (+ target 1) (+ current-1 1) current-2))
(else
(vector-set! holding target (vector-ref vec current-2))
(loop (+ target 1) current-1 (+ current-2 1)))))))
(define pair-merge!
(lambda (subvector-size)
(let loop ((start-1 0))
(if (< (+ start-1 subvector-size) len)
(let* ((start-2 (+ start-1 subvector-size))
(end-2 (min len (+ start-2 subvector-size))))
(merge! start-1 start-2 end-2)
(loop end-2))))))
(do ((subvector-size 1 (* 2 subvector-size)))
((<= len subvector-size))
(pair-merge! subvector-size))))))
Define a Scheme procedure that takes a vector of real numbers and
determines whether its elements are arranged in ascending numerical order,
returning #t if it is and #f if it is not.
Design a test procedure that generates a vector of ten thousand integers
randomly selected from the range from 0 to 999999, calls
merge-sort! to sort the vector into ascending numerical order,
and checks whether the sort succeeded, returning #t if the
elements wind up in ascending numerical order and #f if they
do not.
This document is available on the World Wide Web as
http://www.math.grin.edu/courses/Scheme/spring-1998/merge-sort.html
created November 25, 1997
last revised June 21, 1998