CSC153, Class 20: More Algorithm Analysis

Overview:
* Review: Big-O notation
* Doing Big-O analysis
* Dominant terms and other relationships
* Some recurrence relations
* Experimental analysis

Notes:
* Exam 1 due.
* Homework 2 ready.  Two fun Web services.  Use higher-order stuff.
* I'll be gone this weekend.  I'll try to answer any email.

----------------------------------------

Does anyone remember what Big-O analysis is?
* A way to examine how many steps a program takes.
  + Approximates
  + As a function
  + Giving an upper bound
  + Ignores constant multipliers
* Formal definition: f(n) is in O(g(n)) if and only if
  there exist n0, d > 0 such that
  for all n > n0, |f(n)| <= d*g(n)

How do we do Big-O analysis?
+ Count each "basic" operation as one step.
+ Repetition of a series of steps:
  - Count the cost of the series of steps
  - Count the number of repetitions
  - Multiply
+ Choice between multiple cases:
  - Count each case
  - Take the longest
+ Procedure/subroutine call:
  - Count the number of steps in the subroutine

How long does the mathematical exponentiation algorithm take?
* It depends on how long the functions it calls take.

Some analyses the formal definition helps with:

Suppose f(n) is in O(g(n)) and h(n) is in O(k(n)).
What bound can we give for f(n)+h(n)?
+ Context: I've bounded the first half of my algorithm,
  I've bounded the second half of my algorithm,
  and I do the two halves in sequence.
+ The bound: O(g(n)+k(n))

Proof:
  f(n) is in O(g(n)) implies there exist n00, d0 > 0 such that
    for all n > n00, |f(n)| <= d0*g(n)
  h(n) is in O(k(n)) implies there exist n01, d1 > 0 such that
    for all n > n01, |h(n)| <= d1*k(n)
  Detour: how do we show that f(n)+h(n) is in O(g(n)+k(n))?
  We must find n0, d > 0 such that for all n > n0,
    |f(n)+h(n)| <= d*(g(n)+k(n))
  Let n0 = max(n00,n01) and d = max(d0,d1).  Then, for n > n0,
    |f(n)+h(n)| <= |f(n)| + |h(n)|
               <= d0*g(n) + d1*k(n)
               <= d*g(n) + d*k(n)
               =  d*(g(n)+k(n))
  Q.E.D.
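The sum rule matches how we analyze a procedure with sequential phases. Here is a small sketch in Scheme (the `average` procedure is an invented example, not one from class): each pass over the list is in O(n), so the sequence of the two passes is in O(n + n) = O(n).

```scheme
;; sum makes one pass over its list: O(n) steps.
(define sum
  (lambda (lst)
    (if (null? lst)
        0
        (+ (car lst) (sum (cdr lst))))))

;; average calls sum and then the built-in length, each O(n),
;; one after the other.  By the sum rule, average is in
;; O(n + n) = O(2n) = O(n).
(define average
  (lambda (lst)
    (/ (sum lst) (length lst))))

;; (average '(1 2 3 4)) => 5/2
```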
----------------------------------------

More handy facts:

* If f(n) is in O(g(n)) and h(n) is in O(k(n)), then
  f(n)*h(n) is in O(g(n)*k(n)).
  + This is how we bound loops.

* If f(n) is in O(g(n)) and g(n) is in O(h(n)), what is
  f(n) relative to h(n)?
  + If I wrote f <= g and g <= h, you'd conclude f <= h.
  + Similar here: f(n) is in O(h(n)).

* If f(n) is in O(k*g(n)), where k is a constant greater than 0,
  then f(n) is in O(g(n)).
  Proof:
    There exist n0, dold > 0 such that f(n) <= dold*k*g(n)
    for all n > n0.  [By the definition of Big-O]
    Let dnew be dold*k.
    Now f(n) <= dnew*g(n) = dold*k*g(n) for all n > n0.
  WE HAVE PROVEN THAT YOU CAN IGNORE CONSTANT MULTIPLIERS.

* If f(n) is in O(g(n)+h(n)) and h(n) is in O(g(n)), then
  f(n) is in O(g(n)).
  + Why?  g(n)+h(n) is in O(2*g(n)), which is in O(g(n)).
  WHEN YOU HAVE O(TWO FUNCTIONS), YOU CAN THROW AWAY THE SMALLER.
  + Example: if f(n) is in O(n^2 + 432n), then f(n) is in O(n^2).

----------------------------------------

So how do we analyze recursive functions?  Let's analyze length:

(define length
  (lambda (lst)
    (if (null? lst)
        0
        (+ 1 (length (cdr lst))))))

When there's a test, analyze both branches and add 1 (for the test)
to the larger.
* Computing 0 takes 1 step.
* Computing (+ 1 (length (cdr lst))) takes:
  - one step to add 1,
  - one step to take the cdr,
  - ??? steps to compute (length ...).

So we have a recursive definition of the running time of length:
  t(0) = 2
  t(n) = 3 + t(n-1)

How Sam analyzes these when he forgets the details:
expand repeatedly until you see a pattern.
  t(n) = 3 + t(n-1)
       = 3 + 3 + t(n-2) = 2*3 + t(n-2)
       = 2*3 + 3 + t(n-3) = 3*3 + t(n-3)
       = 3*3 + 3 + t(n-4) = 4*3 + t(n-4)
       ...
       = k*3 + t(n-k)
Eventually, n-k must be 0 (when k=n), so
  t(n) = n*3 + t(0) = n*3 + 2, which is in O(n).

Suppose
  t(0) = b
  t(n) = c*t(n-1)
Then
  t(n) = c*t(n-1)
       = c*c*t(n-2) = c^2 * t(n-2)
       = c*c^2 * t(n-3) = c^3 * t(n-3)
       ...
       = c^n * t(0)
So t(n) = b*c^n, which is in O(c^n) (for c = 2, that's O(2^n)).

What about
  t(n) = c + 2*t(n-1)?
Expanding:
  t(n) = c + 2*t(n-1)
       = c + 2*c + 4*t(n-2) = 3c + 4*t(n-2)
       = 3c + 4*c + 8*t(n-3) = 7c + 8*t(n-3)
       ...
       = (2^k - 1)*c + 2^k * t(n-k)
       = (2^n - 1)*c + 2^n * t(0), which is in O(2^n).

-----

Suppose
  t(0) = 0
  t(n) = n + t(n-1)
Then
  t(n) = n + (n-1) + (n-2) + ... + 3 + 2 + 1
       = n(n+1)/2, which is in O(n^2).
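The expansion technique above can be sanity-checked by computing a recurrence directly and comparing it to the closed form. A small sketch in Scheme for length's recurrence (the procedure names `t` and `t-closed` are mine, not from the lecture):

```scheme
;; t implements length's running-time recurrence directly:
;;   t(0) = 2,  t(n) = 3 + t(n-1)
(define t
  (lambda (n)
    (if (= n 0)
        2
        (+ 3 (t (- n 1))))))

;; t-closed is the closed form found by expansion: t(n) = 3n + 2.
(define t-closed
  (lambda (n)
    (+ (* 3 n) 2)))

;; (t 10) => 32, and (t-closed 10) => 32 as well.
```

Trying a handful of values of n and seeing the two procedures agree is a quick way to catch an expansion mistake before trusting the closed form.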