Fundamentals of Computer Science I: Media Computing (CS151.02 2007F)
In looking at algorithms, we often ask ourselves how many
“steps” the algorithm typically uses. Rather than
looking at every kind of step, we tend to focus on particular
kinds of steps, such as the number of times we have to call
vector-set! or the number of values we look at.
Let's try to look at how much effort the insertion sort algorithm
expends in sorting a list of
n values, starting from a
random initial arrangement. Recall that insertion sort uses two lists:
a growing collection of sorted values and a shrinking collection of
values left to examine. At each step, it inserts a value into the
collection of sorted values.
On average, insertion sort has to look through about half of the
elements in the sorted part of the data structure to find the
correct insertion point for each new value it places. The size of
that sorted part increases linearly from 0 to n, so
its average size is
n/2 and the average number of
comparisons needed to insert one element is n/4.
Taking all the insertions together, then, the insertion sort
performs approximately n^2/4
comparisons to sort the entire set. That is, we do an average
of n/4 comparisons
n times, giving n^2/4.
This function grows much more quickly than the size of the input list. For example, if we have 10 elements, we do about 25 comparisons. If we have 20 elements, we do about 100 comparisons. If we have 40 elements, we do about 400 comparisons. And, if we have 100 elements, we do about 2500 comparisons.
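As a rough check of these figures, the estimate can be written as a small procedure. This is only a sketch of the n^2/4 approximation; the name estimated-insertion-comparisons is made up for this illustration and is not part of any course library.

```scheme
; Hypothetical helper: approximate number of comparisons insertion sort
; makes on n randomly arranged values, using the n^2/4 estimate.
(define estimated-insertion-comparisons
  (lambda (n)
    (quotient (* n n) 4)))

; Displays (25 100 400 2500)
(display (map estimated-insertion-comparisons '(10 20 40 100)))
(newline)
```

Doubling the number of values roughly quadruples the estimate, which is why the cost outpaces the size of the list.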
Accordingly, when the number of values to be sorted is large (greater than one thousand, say), it is preferable to use a sorting method that is more complicated to set up initially but performs fewer comparisons per value in the list. In this reading, we explore one such procedure.
As we saw in the case of binary search, it is often profitable to divide an input in half. For binary search, we could throw away half and then recurse on the other half. Clearly, for sorting, we cannot throw away part of the list. However, we can still rely on the idea of dividing in half. That is, we'll divide the list into two halves, sort them, and then do something with the two result lists.
Here's a sketch of the algorithm in Scheme:
(define new-sort
  (lambda (stuff may-precede?)
    ; If there are only zero or one elements in the list,
    ; the list is already sorted.
    (if (or (null? stuff) (null? (cdr stuff)))
        stuff
        ; Otherwise, split the list in half
        (let* ((halves (split stuff))
               (firsthalf (car halves))
               (secondhalf (cadr halves))
               ; And sort each half, passing along the comparison predicate
               (sortedfirst (new-sort firsthalf may-precede?))
               (sortedsecond (new-sort secondhalf may-precede?)))
          ; Do some more stuff
          ???))))
But what do we do once we've sorted the two sublists? We need to put them back into one list. Through habit, we refer to the process of joining two sorted lists as merging. It is relatively easy to merge two lists: You repeatedly take the smallest remaining element of either list. When do you stop? When you run out of elements in one of the lists, in which case you use the elements of the remaining list. Putting it all together, we get the following:
;;; Procedure:
;;;   merge
;;; Parameters:
;;;   sorted1, a sorted list.
;;;   sorted2, a sorted list.
;;;   may-precede?, a binary predicate that compares values.
;;; Purpose:
;;;   Merge the two lists.
;;; Produces:
;;;   sorted, a sorted list.
;;; Preconditions:
;;;   may-precede? can be applied to any two values from
;;;     sorted1 and/or sorted2.
;;;   may-precede? represents a transitive operation.
;;;   sorted1 is sorted by may-precede? That is, for each i such that
;;;     0 <= i < (- (length sorted1) 1)
;;;       (may-precede? (list-ref sorted1 i) (list-ref sorted1 (+ i 1)))
;;;   sorted2 is sorted by may-precede? That is, for each j such that
;;;     0 <= j < (- (length sorted2) 1)
;;;       (may-precede? (list-ref sorted2 j) (list-ref sorted2 (+ j 1)))
;;; Postconditions:
;;;   sorted is sorted by may-precede?.
;;;     For each k, 0 <= k < (- (length sorted) 1)
;;;       (may-precede? (list-ref sorted k) (list-ref sorted (+ k 1)))
;;;   sorted is a permutation of (append sorted1 sorted2)
;;;   Does not affect sorted1 or sorted2.
;;;   sorted may share cons cells with sorted1 or sorted2.
(define merge
  (lambda (sorted1 sorted2 may-precede?)
    (cond
      ; If the first list is empty, return the second
      ((null? sorted1) sorted2)
      ; If the second list is empty, return the first
      ((null? sorted2) sorted1)
      ; If the first element of the first list is smaller or equal,
      ; make it the first element of the result and recurse.
      ((may-precede? (car sorted1) (car sorted2))
       (cons (car sorted1)
             (merge (cdr sorted1) sorted2 may-precede?)))
      ; Otherwise, do something similar using the first element
      ; of the second list
      (else
       (cons (car sorted2)
             (merge sorted1 (cdr sorted2) may-precede?))))))
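The "sorted by may-precede?" condition in the documentation for merge can itself be expressed as a procedure. The sketch below uses a made-up name, sorted-by?, and relies on the stated assumption that may-precede? is transitive, so checking each neighboring pair is enough.

```scheme
; Hypothetical helper: does every element of lst may-precede? its successor?
; Lists of zero or one elements are trivially sorted.
(define sorted-by?
  (lambda (lst may-precede?)
    (or (null? lst)
        (null? (cdr lst))
        (and (may-precede? (car lst) (cadr lst))
             (sorted-by? (cdr lst) may-precede?)))))

(display (sorted-by? '(10 20 35 42) <=))   ; displays #t
(newline)
(display (sorted-by? '(20 42 10 35) <=))   ; displays #f
(newline)
```

A check like this could be used to test the preconditions before calling merge, or to test the postcondition on its result.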
All that we have left to do is to figure out how to split a list into
two parts. One easy way is to get the length of the list and then
cdr down it for half the elements, accumulating the skipped elements
as you go. Since it's easiest to accumulate a list in reverse order,
we reverse it again when we're done. (Merge sort doesn't really care
whether the elements stay in their original order, but perhaps we want to use
split for other purposes.)
;;; Procedure:
;;;   split
;;; Parameters:
;;;   lst, a list
;;; Purpose:
;;;   Split a list into two nearly-equal halves.
;;; Produces:
;;;   halves, a list of two lists
;;; Preconditions:
;;;   lst is a list.
;;; Postconditions:
;;;   halves is a list of length two.
;;;   Each element of halves is a list (which we'll refer to as
;;;     firsthalf and secondhalf).
;;;   lst is a permutation of (append firsthalf secondhalf).
;;;   The lengths of firsthalf and secondhalf differ by at most 1.
;;;   Does not modify lst.
;;;   Either firsthalf or secondhalf may share cons cells with lst.
(define split
  (lambda (lst)
    ;;; kernel
    ;;;   Remove the first count elements of a list.  Return the
    ;;;   pair consisting of the removed elements (in order) and
    ;;;   the remaining elements.
    (let kernel ((remaining lst)   ; Elements remaining to be used
                 (removed null)    ; Accumulated initial elements
                 (count            ; How many elements left to use
                  (quotient (length lst) 2)))
      ; If no elements remain to be used,
      (if (= count 0)
          ; The first half is in removed and the second half
          ; consists of any remaining elements.
          (list (reverse removed) remaining)
          ; Otherwise, use up one more element.
          (kernel (cdr remaining)
                  (cons (car remaining) removed)
                  (- count 1))))))
In the corresponding lab, you'll have an opportunity to consider other ways to split the list. In that lab, you'll work with a slightly changed version of the code.
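One possible alternative, offered only as a sketch and not necessarily the version used in the lab, is to deal the elements alternately into two lists. The name split-alternating is made up for this illustration. It walks the list once and never computes its length, at the cost of not keeping the halves contiguous.

```scheme
; Hypothetical alternative to split: deal elements alternately into two lists.
; The two lists still differ in length by at most one.
(define split-alternating
  (lambda (lst)
    (let kernel ((remaining lst)   ; Elements not yet dealt out
                 (evens '())       ; Elements at positions 0, 2, 4, ... (reversed)
                 (odds '()))       ; Elements at positions 1, 3, 5, ... (reversed)
      (cond
        ; Nothing left: put both accumulated lists back in order.
        ((null? remaining)
         (list (reverse evens) (reverse odds)))
        ; One element left: it belongs with the even positions.
        ((null? (cdr remaining))
         (list (reverse (cons (car remaining) evens)) (reverse odds)))
        ; Otherwise, deal out two elements and continue.
        (else
         (kernel (cddr remaining)
                 (cons (car remaining) evens)
                 (cons (cadr remaining) odds)))))))

(display (split-alternating '(1 2 3 4 5)))   ; displays ((1 3 5) (2 4))
(newline)
```

Because merge sort only needs two lists of nearly equal length, a procedure like this satisfies the same postconditions about lengths and permutation as split.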
We saw most of the
merge-sort procedure above,
but with a bit of code left to fill in. Here's a new version, with
that code filled in (and a few other changes).
(define merge-sort
  (lambda (stuff may-precede?)
    ; If there are only zero or one elements in the list,
    ; the list is already sorted.
    (if (or (null? stuff) (null? (cdr stuff)))
        stuff
        ; Otherwise,
        ;   split the list in half,
        ;   sort each half,
        ;   and then merge the sorted halves.
        (let* ((halves (split stuff))
               (some (car halves))
               (rest (cadr halves)))
          (merge (merge-sort some may-precede?)
                 (merge-sort rest may-precede?)
                 may-precede?)))))
There's an awful lot of recursion going on in merge sort as we repeatedly split the list again and again and again until we reach lists of length one. Rather than doing all that recursion, we can start by building all the lists of length one and then repeatedly merging pairs of neighboring lists. For example, suppose we start with sixteen values, each in a list by itself.
((20) (42) (35) (10) (69) (92) (77) (27) (67) (62) (1) (66) (5) (45) (25) (90))
When we merge neighbors, we get sorted lists of two elements. At some
places, such as when we merge (20) and (42),
the elements stay in their respective order. At other places, such
as when we merge (35) and
(10), we need to
swap order to build ordered lists of two elements.
((20 42) (10 35) (69 92) (27 77) (62 67) (1 66) (5 45) (25 90))
Now we can merge these sorted lists of two elements into sorted lists
of four elements. For example, when we merge (20 42) and
(10 35), we first take the 10 from the second list,
then the 20 from the first list, then the 35 from the second list,
then the 42 that is all that's left.
((10 20 35 42) (27 69 77 92) (1 62 66 67) (5 25 45 90))
We can merge these sorted lists of four elements into sorted lists of eight elements.
((10 20 27 35 42 69 77 92) (1 5 25 45 62 66 67 90))
Finally, we can merge these sorted lists of eight elements into one sorted list of sixteen elements.
((1 5 10 20 25 27 35 42 45 62 66 67 69 77 90 92))
Now we have a list of one list, so we take the car to extract the list.
(1 5 10 20 25 27 35 42 45 62 66 67 69 77 90 92)
Translating this technique into code is fairly easy.
We use one helper,
merge-pairs, to merge neighboring pairs.
We use a second helper,
repeat-merge, to repeatedly call
merge-pairs until there are no more pairs to merge.
(define new-merge-sort
  (lambda (lst may-precede?)
    (letrec (
             ; Merge neighboring pairs in a list of lists
             (merge-pairs
              (lambda (list-of-lists)
                (cond
                  ; Base case: Empty list.
                  ((null? list-of-lists) null)
                  ; Base case: Single-element list (nothing to merge)
                  ((null? (cdr list-of-lists)) list-of-lists)
                  ; Recursive case: Merge first two and continue
                  (else (cons (merge (car list-of-lists) (cadr list-of-lists)
                                     may-precede?)
                              (merge-pairs (cddr list-of-lists)))))))
             ; Repeatedly merge pairs
             (repeat-merge
              (lambda (list-of-lists)
                ; Show what's happening
                ; (write list-of-lists) (newline)
                ; If there's only one list in the list of lists
                (if (null? (cdr list-of-lists))
                    ; Use that list
                    (car list-of-lists)
                    ; Otherwise, merge neighboring pairs and start again.
                    (repeat-merge (merge-pairs list-of-lists))))))
      ; An empty list is already sorted, and repeat-merge expects
      ; at least one list in its list of lists.
      (if (null? lst)
          null
          (repeat-merge (map list lst))))))
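To see the intermediate lists of lists, one option is a variant that collects every stage instead of returning only the final list. The sketch below is self-contained so that it can be run on its own; merge-two, merge-neighbors, and merge-passes are made-up names that mirror merge, merge-pairs, and repeat-merge.

```scheme
; Merge two sorted lists into one sorted list.
(define merge-two
  (lambda (sorted1 sorted2 may-precede?)
    (cond
      ((null? sorted1) sorted2)
      ((null? sorted2) sorted1)
      ((may-precede? (car sorted1) (car sorted2))
       (cons (car sorted1) (merge-two (cdr sorted1) sorted2 may-precede?)))
      (else
       (cons (car sorted2) (merge-two sorted1 (cdr sorted2) may-precede?))))))

; Merge neighboring pairs in a list of sorted lists.
(define merge-neighbors
  (lambda (list-of-lists may-precede?)
    (cond
      ((null? list-of-lists) '())
      ((null? (cdr list-of-lists)) list-of-lists)
      (else
       (cons (merge-two (car list-of-lists) (cadr list-of-lists) may-precede?)
             (merge-neighbors (cddr list-of-lists) may-precede?))))))

; Return every stage, from the one-element lists to the single sorted list.
(define merge-passes
  (lambda (lst may-precede?)
    (let loop ((stage (map list lst))   ; The current list of lists
               (stages '()))            ; Earlier stages, most recent first
      (if (or (null? stage) (null? (cdr stage)))
          (reverse (cons stage stages))
          (loop (merge-neighbors stage may-precede?)
                (cons stage stages))))))

(define example-stages
  (merge-passes '(20 42 35 10 69 92 77 27 67 62 1 66 5 45 25 90) <=))

; Write one stage per line.
(for-each (lambda (stage) (write stage) (newline)) example-stages)
```

For the sixteen values used in the example, this should write five lines: the one-element lists, then the lists of two, four, and eight elements, and finally the list containing the single sorted list.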
At the beginning of this reading, we saw that insertion sort takes
approximately n^2/4
steps to sort a list of
n elements. How long does
merge sort take? We'll look at new-merge-sort,
since it's easier to analyze. However, since it does essentially the
same thing as the original
merge-sort, just in
a slightly different order, the running time will be similar.
We'll do our analysis in a few steps. First, we will consider the
number of steps in each call to merge-pairs. Next,
we will consider the number of times repeat-merge calls
merge-pairs. Finally, we'll put the two
together. To make things easier, we'll assume that n
(the number of elements in the list) is a power of two.
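In its first call, merge-pairs takes n lists
of length 1 to merge them into
n/2 lists of
length 2. Building a list of length 2 takes approximately two
steps, so
merge-pairs takes approximately
n steps to do its
first set of merges.
In its second call, merge-pairs takes n/2 lists
of length 2 to merge them into
n/4 lists of
length 4. Building a merged list of length 4 takes approximately
four steps, so
merge-pairs takes approximately
n steps to build
n/4 lists of length 4.
In general, we call
repeat-merge to merge
n/2^k lists of length 2^k into
n/2^(k+1) lists of length 2^(k+1).
A little math suggests that this once again takes approximately
n steps: we build n/2^(k+1) lists, and each takes
approximately 2^(k+1) steps to build.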
So far, so good. Now, how many times do we call
merge-pairs? We go from lists of length 1, to lists
of length 2, to lists of length 4, to lists of length 8, ...,
to lists of length
n/4, to lists of length
n/2, to one list of length n.
How many times did we call merge-pairs?
The number of times we need to multiply 2 by itself to get
n. As we've noted before, the formal name for that
value is log2(n).
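The count of merging rounds can also be computed directly. The sketch below uses a made-up name, merge-rounds, and assumes that a list without a partner is carried into the next round unchanged, so an odd number of lists is halved and rounded up.

```scheme
; Hypothetical helper: how many rounds of pairwise merging are needed
; to reduce count lists to a single list?
(define merge-rounds
  (lambda (count)
    (if (<= count 1)
        0
        ; Each round leaves half as many lists, rounding up when
        ; one list has no partner.
        (+ 1 (merge-rounds (quotient (+ count 1) 2))))))

(display (merge-rounds 16))     ; displays 4
(newline)
(display (merge-rounds 1024))   ; displays 10
(newline)
```

When the number of values is a power of two, the result is exactly the base-2 logarithm of that number.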
To conclude, merge sort repeats a step of approximately n steps
log2(n) times. Hence, it takes approximately n log2(n) steps.
Is this much better than insertion sort, which took approximately
n^2/4 steps?
Here's a chart that will help you compare various running times.
As you can see, although the two sorting algorithms start out taking approximately the same time, as the length of the list grows, the cost of using insertion sort becomes a larger and larger multiple of the cost of using merge sort.
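The two estimates can also be compared numerically. The following is a sketch under the assumptions of the analysis: the number of values is a power of two, merge sort takes about n log2(n) steps, and insertion sort takes about n^2/4 steps. All of the procedure names are made up for this illustration.

```scheme
; Hypothetical helper: base-2 logarithm of a power of two.
(define floor-log2
  (lambda (n)
    (if (<= n 1)
        0
        (+ 1 (floor-log2 (quotient n 2))))))

; Estimated steps for merge sort: n log2(n).
(define merge-estimate
  (lambda (n)
    (* n (floor-log2 n))))

; Estimated steps for insertion sort: n^2/4.
(define insertion-estimate
  (lambda (n)
    (quotient (* n n) 4)))

; Display n, the merge sort estimate, and the insertion sort estimate.
(define show-row
  (lambda (n)
    (display n) (display " ")
    (display (merge-estimate n)) (display " ")
    (display (insertion-estimate n))
    (newline)))

(for-each show-row '(16 64 256 1024 4096))
```

For 16 values both estimates are 64. For 1024 values the estimates are 10240 and 262144, so insertion sort is estimated to take more than 25 times as many steps.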
In the laboratory, you'll have an opportunity to analyze experimentally how many steps each algorithm uses.
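One way to count steps experimentally, sketched here as an assumption rather than as the technique the laboratory uses, is to wrap the comparison predicate so that every call is counted. The names make-counted, counted<=, and comparisons-so-far are made up for this illustration.

```scheme
; Hypothetical helper: wrap a predicate so that calls to it are counted.
; Returns a list of two procedures: the counting predicate and a
; zero-parameter procedure that reports the number of calls so far.
(define make-counted
  (lambda (may-precede?)
    (let ((count 0))
      (list (lambda (a b)
              (set! count (+ count 1))
              (may-precede? a b))
            (lambda () count)))))

(define counted (make-counted <=))
(define counted<= (car counted))
(define comparisons-so-far (cadr counted))

; Two sample comparisons
(counted<= 1 2)
(counted<= 5 3)

(display (comparisons-so-far))   ; displays 2
(newline)
```

Passing counted<= in place of <= to a sorting procedure and then calling comparisons-so-far would report how many comparisons that sort made.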
You may have noted that we have not yet written the documentation for merge sort. Why not? Because it's basically the same as the documentation for any other sorting routine.
Copyright © 2007 Janet Davis, Matthew Kluber, and Samuel A. Rebelsky. (Selected materials copyright by John David Stone and Henry Walker and used by permission.)
This material is based upon work partially supported by the National Science Foundation under Grant No. CCLI-0633090. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.
This work is licensed under a Creative Commons
Attribution-NonCommercial 2.5 License. To view a copy of this
license, visit the Creative Commons website
or send a letter to Creative Commons, 543 Howard Street, 5th Floor,
San Francisco, California, 94105, USA.