CSC153 2004S, Class 20: Algorithm Analysis (1) Admin: * Questions on homework 4? * Try working with the arrays directly, and not just converting to lists and back * Optional reading available this afternoon. * Algorithm design and hop * Notes on tail recursion * What's the benefit of tail recursion, don't you still need the stack? * No, a good implementation can replace the arguments "in place" * So what's the difference in efficiency between tail-recursion and iteration? * There is none * So why bother with tail-recursion? (Why not just have iteration.) * Because the designers of Scheme didn't want to include iteration. (Well ... more "Sam didn't want to teach you the iteration operation in Scheme.") * Some algorithms are more natural to solve recursively. (Are there algorithms that are more naturally tail-recursive than iterative?) * Tail-recursion is a natural stage between general recursion and iteration. * Often easier to reason about tail-recursive procedures. Overview: * Choosing between different algorithms that solve the same problem * Example: suml vs. sumr * Example: Verified sum * Measuring the running time of an algorithm * Problems * Asymptotic analysis Given two algorithms that solve the same problem, how do you decide which is "better". * The one that is quicker to write. * The one that runs more quickly. <--- The key issue * The one that is easier to read. * The one that uses the least memory. * Microsoft solution: The one that does everything, including making sandwiches * The one that is shortest (least code) * The one that works correctly (we'll assume they both work correctly) * The one that is most user friendly (however, they don't interact with the user) * The one easiest to modify for other problems. * The one easiest to prove correct. Given two algorithms, how do you decide which is "faster"? * Time them, running on the same input * Unfortunately, the number of steps an algorithm takes on the same size input is not consistent * Unfortunately, an algorithm that works better on small inputs does not necessarily work better on large inputs Comparing suml to sumr: * For an input of size 100,000, suml takes about .1 seconds and sumr takes about .16 seconds * For an input of size 200,000, suml takes about .2 seconds and sumr takes about .8 seconds * For an input of sizee 400,000, suml takes about .4 seconds and sumr takes about 2 seconds Observations: * The running time of suml is directly proportional to the size of the array * The running time of sumr doesn't seem to be as proportional The Mathematician's solution: Come up with a more general model for the *shape* of the curve (running time vs. input) * Why: The general shape of the curve will tell you more about the comparative running time on larger inputs * Why: Sometimes general shape is easier to compute How do we model the shape of a curve? * Scientist: Experimentally * Better scientist or Mathematician: Analyze the program to see what shape curve you expect Theory: * If you take "worst" inputs * And you look at big enough inputs that other factors don't come into play * Then most algorithms have a natural "upper bound" curve * Which may or may not be easy to find How we estimate that running time, informally * For iterative programs (no recursion): + Count any "basic action" (addition, multiplication, input, etc) as one step + For a conditional, take the more expensive of its two branches and add the cost of the comparison + For a loop: Multiply the worst number of repetitions by the cost of the body + For a function call: Do the same analysis of the function call procedure selection-sort(array-of-ints numbers) for i = 0 to numbers.length swap(numbers[i], indexOfSmallest(numbers, i) procedure indexOfSmallest(array-of-ints numbers, int start) guess = start; for j = start + 1 to numbers.length if numbers[j] > numbers[guess] guess = j; return guess; How long does this take? number.length repetitions * (1 for array index + 1 for swap + cost of indexOfSmallest) indexOfSmallest (numbers.length repetitions of about 3 steps) numbers.length * (2 + 3*numbers.length) Because this is such a rough estimate, it's silly to pay attention to the constant addition and multipliers. The shape of the curve is N^2. The running time is O(numbers.length^2) Let's talk about shape more formally. Big-O notation. O(f(n)) is a set of functions that are "bounded above" by the shape of f(n) for large enough inputs g(n) is in O(f(n)) if there exist n0, c such that |c*g(n)| <= f(n) for all n > n0 n0 is the "for big enough input" c is "the shape of the curve matters, not the constant multiplier" 2*n^2 is in O(n^2) Proof: Let c be 1/2 and n0 be 1 1/2 * 2 * n^2 <= n^2 for all n > 1 How does this relate to running time? We describe the running time of algorithms in terms of big-O notation. That running time of selection sort (above) is in O(n^2) Big-O notation is nice because it lets us (1) Ignore constants (2) Ignore smaller-order terms Thm: If g(x) is d*f(x) (d not 0) then g(x) is in O(f(x)) Proof: Let c = 1/d, n0 = 0 c*g(x) = 1/d * g(x) = 1/d * d * f(x) = f(x) <= f(x) Result: We never say that the running time of an algorithm is, say O(3x). We instead ignore the constant multiplier and say its O(x) Thm: if g(x) is in O(f(x)) then h(x) = f(x) + g(x) is in O(f(x)) Proof by nodding Result: We never have to say the running time is O(n^2 + n) but rather O(n^2) Note: This is a *very* rough estimate. How do we get more accurate? (1) little-o for lower bounds (2) theta notation for "exact" bounds (3) Start being more of a scientist Note that you usually want to identify the closest possible bounding function. If f(n) is in O(n^2) it is also in O(n^3) and O(n^100) How do we analyze a recursive procedure? More or less the same thing: When you count the steps for the recursive call, you're stuck with a recursive definition of time. (define selection-sort-numbers (lambda (vec) (let kernel ((pos 0)) (if (< pos (vector-length vec)) (begin (swap vec pos (index-of-smallest vec pos)) (kernel (+ pos 1))))))) Running time of the kernel on a vector of length n: 1 for conditional 1 for swap n (length of vector) for index-of-smallest cost of running kernel on n-1 values Technique for analysis: Recurrence relations Have a good day (woo hoo!)