CSC151 2009F, Class 44: Binary Search

Admin:
* I'm a bit disappointed with the increasing number of people missing class.
    * Missing when you're sick is acceptable.
    * Missing for other reasons is not so acceptable.
    * But whatever reason makes you miss, you have a responsibility to notify me.
* I have been trying to acknowledge the stressful part of the semester by cutting work.
    * There is no reading for Friday.
    * Use the extra time to work on your projects.
* Project proposals due today.
    * Sketches
* Reminder: Projects are due next Tuesday.
* EC for
    * Thursday's convocation.
    * Thursday's CS Extra on Bioinformatics (Thursday at 4:15, 3821).
    * Friday's CS Table on "Computational Thinking" (noon, PDR)
        * Readings available in class today.
    * Friday night's swim meet.
    * As You Like It
    * Chamber Ensemble Saturday at 4pm
    * Amazing Percussion plus Flute ensemble Friday night

Overview:
* The problem of searching.
* Analyzing algorithmic efficiency.
* Making searching more efficient using divide-and-conquer.

Recent topic in class: Association Lists
* Basic idea: Given a list of values, search for a particular value
* Requires a particular arrangement of your data
    * List of values (vs. a vector or tree)
    * Each value must be a list or pair
    * The part of the value used for searching is the car
* We can extend: Instead of looking at the car, we can look at any element
* Yay recursion!
* We have a straightforward and seemingly correct solution to the problem of searching.
* Question: Can we make searching more efficient?
* Basic recursion-based searching in lists:
    * Requires you to look at each element, in turn, until you find it or run out of elements.
    * "On average", about N/2 elements (N = (length lst)) to look at
* Can we make this better (as we do in searching phone books)?
    * "Find the first letter of the last name"
    * Then look one by one
* Can we generalize?
    * Look in the middle
        * Magically, the thing you're looking for is in the middle
        * The thing in the middle comes after the thing you're looking for
            Throw away the second half
            Start all over (recurse!)
        * The thing in the middle comes before the thing you're looking for
            Throw away the first half
            Start all over (recurse!)

Is this really any better than "look at each thing in turn"?
* Yes, we're throwing away massive amounts of stuff
* Suppose we were dealing with the NYC phone directory
    8 million, 4 million, 2 million, 1 million, 500 K, 250 K, 125 K, 64 K, 32 K, 16 K, 8 K, 4 K, 2 K, 1 K, 500, 250, 125, 63, 32, 16, 8, 4, 2, 1
* About log_2(N) steps (23 halvings in this example)

Requirements:
* The thing we're searching must already be sorted by the key
* It needs to be easy to find the middle element
* And to throw away half
* Won't work for lists: Can't find the middle element quickly
* But it's quick to find the middle element of a vector
* How do we throw away half a vector quickly?
    * We just keep track of the positions of the portion of the vector of interest
* Suppose we're writing binary search

(define binary-search
  (lambda (value-we-are-looking-for
           the-vector-we-are-looking-through
           a-procedure-that-gives-back-the-key-of-an-entry
           a-way-to-compare-keys)
    (let kernel ((starting-position-of-range-of-interest 0)
                 (ending-position-of-range-of-interest
                  (- (vector-length the-vector-we-are-looking-through) 1)))
      ...
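
The definition above stops at the kernel's bindings. What follows is one possible way to finish it, offered as a sketch rather than the version written in class. It makes several assumptions that the notes do not state: the parameter names are shortened (key, vec, get-key, less-than?), the comparison is a strict "comes before" test on keys, and the procedure returns the matching entry or #f when nothing matches. The sample directory and its extensions are made up for illustration.

```scheme
;; A sketch under stated assumptions, not the class's official code.
;; Assumes: vec is sorted by key; less-than? is a strict ordering on keys;
;; the result is the matching entry, or #f if no entry has the key.
(define binary-search
  (lambda (key vec get-key less-than?)
    ;; lower and upper bound the portion of the vector still of interest.
    (let kernel ((lower 0)
                 (upper (- (vector-length vec) 1)))
      (if (> lower upper)
          ;; The range of interest is empty, so the key is not present.
          #f
          (let* ((mid (quotient (+ lower upper) 2))
                 (entry (vector-ref vec mid))
                 (mid-key (get-key entry)))
            (cond
              ;; The middle comes after our key: throw away the second half.
              ((less-than? key mid-key)
               (kernel lower (- mid 1)))
              ;; The middle comes before our key: throw away the first half.
              ((less-than? mid-key key)
               (kernel (+ mid 1) upper))
              ;; Neither comes before the other: we found it.
              (else entry)))))))

;; Hypothetical sample data: pairs of (name . extension), sorted by name.
(define directory
  (vector (cons "Adams" "x1111")
          (cons "Baker" "x2222")
          (cons "Chen" "x3333")
          (cons "Davis" "x4444")))

(binary-search "Chen" directory car string<?)  ; => ("Chen" . "x3333")
(binary-search "Zed" directory car string<?)   ; => #f
```

Nothing is copied when half the vector is "thrown away"; only lower and upper change, which is why the vector (and not a list) makes each step cheap.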