CS153, Class 24: Fast Sorting Algorithms

Topics:
* Divide and Conquer
* Merge sort
* Quicksort
* Lab

Notes:
* Lots of readings: Merge Sort, Quicksort, Randomness and Simulation.
* New lab writeup.
* Questions on homework 2?
  + Validate your HTML at http://validator.w3.org
* Preconditions and Postconditions for insert?
* Top-down or Bottom-up?

----------------------------------------

Document "insert"

;;; Procedure:
;;;   insert
;;; Parameters:
;;;   val, a Scheme value
;;;   lst, a Scheme list (null or otherwise)
;;;   can-come-before?, a binary predicate
;;; Purpose:
;;;   Create a new list that contains all the values in
;;;   lst and val and is sorted (see preconditions for
;;;   definition of sorted).
;;; Produces:
;;;   newlst
;;; Preconditions:
;;;   lst must be sorted according to the scheme of can-come-before?
;;;     That is,
;;;       (can-come-before? (list-ref lst i) (list-ref lst (+ i 1)))
;;;     for all "reasonable" i.
;;;   can-come-before? can be applied to any two elements from
;;;     (cons val lst)
;;;   can-come-before? keeps returning the same values given
;;;     the same arguments [Implicit for most Scheme procs]
;;;   can-come-before? is transitive.
;;;   There exists a permutation of (cons val lst) that is sorted.
;;;   For all reasonable values a and b, at least one of
;;;     the following must hold:
;;;       1. (can-come-before? a b)
;;;       2. (can-come-before? b a)
;;; Postconditions:
;;;   newlst contains all the elements in lst and val.
;;;     That is,
;;;       newlst is a permutation of (cons val lst)
;;;   newlst is sorted (see above).
;;;   newlst is not null.

Sam gets to play lazy programmer

  > (insert 1 (list 0 4 9 11) <=)
  (0 1 2 3 4 5 6 7 8 9 10 11)
  > (insert 1 (list 0 4 9 11) <=)
  (0 0 0 0 1 1 1 1 1 4 4 4 9 11)

Sam gets to play nasty user of your procedure

  > (insert 1 (list 2 2 2 2 2) =)
  (2 2 2 2 2 1)
    ; Brian returns
    ; Doesn't meet the postconditions
    ; LAWSUIT
  'insert "Sorry, I was unable to insert"
    ; I met all the preconditions.  You can't quit!
    ; Lawsuit!
It may be overkill, but it forced us to think carefully.

----------------------------------------

Sorting faster than insertion and selection sort

We improved searching by
* Making some requirements on the input (must be sorted)
* Dividing in half at each step.

How can we apply "divide in half" to the process of sorting?
* First half and second half
* Even positions and odd positions

Sort both halves
* We now have two halves that are in order
* E.g., (1 3 5 6 9 10 11) (2 4 4 5 6 7 8)

Repeatedly:
* Add the smaller of the two lists' first elements to the result list

(define merge
  (lambda (lst1 lst2 can-come-before?)
    (cond
      ((null? lst1) lst2)
      ((null? lst2) lst1)
      ((can-come-before? (car lst1) (car lst2))
       (cons (car lst1) (merge (cdr lst1) lst2 can-come-before?)))
      (else
       (cons (car lst2) (merge lst1 (cdr lst2) can-come-before?))))))
; Pre: lst1 and lst2 are sorted according to can-come-before?

How long does it take to merge two lists into a list of length n?
  O(n)
How long does it take to split a list of length n into two lists
of length n/2?
  O(n)

Running time of this algorithm:
  f(1) = 1
  f(n) = 2*n + 2*f(n/2)
  f(n) is in O(n*log_2(n))

----------------------------------------

Are there other ways to divide the input?
* For years: Early years and late years
* Generally: "Small" things and "large" things
  Everything in small <= everything in large
  (append (sort small) (sort large))

How do you decide what's large and what's small?
1. Find smallest, find largest, average
   (1 10 100 1000 10000 100000 1000000)
2. Weirdo strategy: Pick a "random" element of the list.

Quicksort: Divide and conquer sorting using that split strategy

Running time:
  O(n*log_2(n)) if you usually pick well
  O(n^2) if you regularly pick badly

LAB ON MONDAY (and please show up on time)
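----------------------------------------

Sketch: putting the pieces together

The two divide-and-conquer strategies above might be written roughly as follows. This is a sketch under assumptions, not the version from the readings or the lab. The names split, merge-sort, and quicksort are hypothetical; only merge comes from the outline, and it is repeated here so the block stands alone. The split uses the "even positions and odd positions" idea. The quicksort takes the first element as its pivot rather than a "random" one, which keeps the code in standard Scheme but makes the O(n^2) case easy to hit on input that is already sorted.

```scheme
;;; merge, repeated from the outline so this block is self-contained.
;;; Pre: lst1 and lst2 are sorted according to can-come-before?
(define merge
  (lambda (lst1 lst2 can-come-before?)
    (cond
      ((null? lst1) lst2)
      ((null? lst2) lst1)
      ((can-come-before? (car lst1) (car lst2))
       (cons (car lst1) (merge (cdr lst1) lst2 can-come-before?)))
      (else
       (cons (car lst2) (merge lst1 (cdr lst2) can-come-before?))))))

;;; split (hypothetical name): divide lst by alternating positions.
;;; Returns a list of two lists whose lengths differ by at most one.
(define split
  (lambda (lst)
    (if (or (null? lst) (null? (cdr lst)))
        (list lst '())
        (let ((parts (split (cddr lst))))
          (list (cons (car lst) (car parts))
                (cons (cadr lst) (cadr parts)))))))

;;; merge-sort (hypothetical name): split, sort both halves, merge.
(define merge-sort
  (lambda (lst can-come-before?)
    (if (or (null? lst) (null? (cdr lst)))
        lst   ; zero or one element: already sorted
        (let ((halves (split lst)))
          (merge (merge-sort (car halves) can-come-before?)
                 (merge-sort (cadr halves) can-come-before?)
                 can-come-before?)))))

;;; quicksort (hypothetical name): partition around a pivot, then
;;; (append (sort small) (sort large)) with the pivot in between.
;;; ASSUMPTION: the pivot is the first element, not a random one.
(define quicksort
  (lambda (lst can-come-before?)
    (if (or (null? lst) (null? (cdr lst)))
        lst
        (let ((pivot (car lst)))
          (let loop ((remaining (cdr lst)) (small '()) (large '()))
            (cond
              ((null? remaining)
               (append (quicksort small can-come-before?)
                       (cons pivot (quicksort large can-come-before?))))
              ((can-come-before? (car remaining) pivot)
               (loop (cdr remaining) (cons (car remaining) small) large))
              (else
               (loop (cdr remaining) small (cons (car remaining) large)))))))))
```

  > (merge-sort (list 5 3 9 1 3) <=)
  (1 3 3 5 9)
  > (quicksort (list 5 3 9 1 3) <=)
  (1 3 3 5 9)

Both procedures do O(n) work per level of recursion. Merge sort always halves the list, so it has about log_2(n) levels. Quicksort has that few levels only when the pivot lands near the middle.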