CSC153, Class 19: Algorithm Analysis
Summary:
* Why prefer higher-order procedures
* Comparing algorithms
* Asymptotic analysis
* Eliminating constants
* How do you do it?
* Other notes
Notes:
* Exam 1 due Friday. Any final questions?
* How many of you have done this stuff before?
* Warning! I do read plans once in a while.
* No extra reading for Friday.
  Work on the exam.
  Read more of Stone.
* Send me information about three books
(with five adjectives).
* Another cool stats talk at noon. Free pizza!
* Cool math talk today at 4:15.
----------------------------------------
Why some programmers prefer higher-order procedures
"Compute the inner product of two vectors, A and B"
The Mathematician
"The sum of the products of the individual pairs"
The C++ programmer
double sum = 0;
for (int i = 0; i < A.size(); ++i)
  sum += A[i]*B[i];
The Scheme programmer
(insert + (map * A B))
or
(apply + (map * A B))
or even
(insert + (map (left-section apply *) (map list A B)))
Why would one like the C++ one?
* It's easier to convert to Math terms
* Easier to estimate the running time
Why would one like the Scheme one?
* It's easier to convert to Math terms
* Scheme is shorter. Don't have to worry about ++i or i++
* It's much easier to parallelize the Scheme version
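A runnable sketch of the Scheme versions, with a guessed definition of
left-section (the course helper that fixes the first argument of a
procedure; this definition is my assumption, not necessarily the
course's own):

```scheme
;; Assumed definition of left-section: fix the first argument of a
;; procedure, producing a one-argument procedure.
(define left-section
  (lambda (proc left)
    (lambda (right) (proc left right))))

(define A '(1 2 3))
(define B '(4 5 6))

;; Sum of the products of the individual pairs.
(apply + (map * A B))                                 ; => 32
(apply + (map (left-section apply *) (map list A B))) ; => 32
```

Note how the second version first pairs up the elements with
(map list A B) and then multiplies each pair.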
----------------------------------------
We've just looked at different algorithms that solve the
same problem.
We often have many different algorithms to solve the same
problem.
Exponentiation:
* Sam's technique: Divide and conquer when exponent is even
* Da Ma: Repeated multiplication
* The mathematician: x^n = e^(n*ln(x))
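Hedged sketches of the first two techniques (procedure names and
details are my guesses; the divide-and-conquer version names the
half-power once so that it is not recomputed):

```scheme
;; Repeated multiplication: n multiplications, linear in n.
(define expt-linear
  (lambda (x n)
    (if (zero? n)
        1
        (* x (expt-linear x (- n 1))))))

;; Divide and conquer: halve the exponent when it is even,
;; logarithmic in n.
(define expt-fast
  (lambda (x n)
    (cond ((zero? n) 1)
          ((even? n)
           (let ((half (expt-fast x (quotient n 2))))
             (* half half)))
          (else (* x (expt-fast x (- n 1)))))))

(expt-fast 2 10) ; => 1024
```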
How do we choose which one?
* Correctness: Does it work on all valid inputs?
Semi-correctness: Does it work on all inputs our program
will deal with?
* Ease of implementation:
+ How quickly can I implement it?
+ How sure can I be that I got my implementation correct?
+ Length of code
* How fast does it actually run?
+ Time efficiency
* Ease of use
+ Less clueful programmers might call your procedure
* Robustness: What does it do on incorrect inputs?
* Memory efficiency
* Generality: Can it also solve related problems?
In practice, "How fast does it actually run" becomes the
primary consideration
It is difficult to analyze precisely how fast code will run
(in almost any language)
* "to analyze precisely" vs. "to precisely analyze"
Consider a loop with an internal conditional
(define largest-in-list
(lambda (lst)
(cond ((null? (cdr lst)) (car lst))
((> (car lst) (largest-in-list (cdr lst)))
(car lst))
(else (largest-in-list (cdr lst))))))
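Note that the version above may call (largest-in-list (cdr lst))
twice: once in the test and once in the else branch. In the worst
case (an increasing list) that gives exponential running time. A
common fix (a sketch, not necessarily how the course writes it)
names the recursive result once:

```scheme
(define largest-in-list
  (lambda (lst)
    (if (null? (cdr lst))
        (car lst)
        ;; Compute the recursive result once, then compare.
        (let ((rest-largest (largest-in-list (cdr lst))))
          (if (> (car lst) rest-largest)
              (car lst)
              rest-largest)))))

(largest-in-list '(3 1 4 1 5)) ; => 5
```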
Different "basic" operations take different amounts of time
Don't worry about the details; worry about the general pattern
* Some algorithms always seem to take the same time
* Some algorithms seem to take some constant times the
number of values you're processing
* Some algorithms seem to take ...
[Cool picture]
* The comparative analysis usually ignores small inputs
(because funky things happen with small inputs)
+ As the input gets large = "asymptotic"
* The comparative analysis usually ignores constant multipliers
Let's write some formal notation to help formalize these
concepts.
O(g(n)) "big O"
Provides an upper bound for functions.
Officially O(g(n)) is a *set* of functions
Goal: f(n) is in O(g(n)) if and only if
"for sufficiently big inputs, g(n) is at least as
big as f(n), ignoring constant multipliers"
f(n) in O(g(n)) if and only if there exist values
n0 and d > 0 such that
for all n > n0, f(n) <= d*g(n)
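For example, f(n) = 3n + 5 is in O(n): take d = 4 and n0 = 5, since
3n + 5 <= 4n whenever n >= 5. A quick empirical check of the
definition (an illustration, not a proof):

```scheme
(define f (lambda (n) (+ (* 3 n) 5)))
(define g (lambda (n) n))
(define d 4)
(define n0 5)

;; Verify f(n) <= d*g(n) for n = n0+1 .. 100.
(define check
  (lambda (n)
    (cond ((> n 100) #t)
          ((> (f n) (* d (g n))) #f)
          (else (check (+ n 1))))))

(check (+ n0 1)) ; => #t
```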
Since constants don't matter, we ignore them in writing
our typical running times
* f(n) is in O(1) "constant time"
+ vector-ref
* f(n) is in O(log_2(n)) "logarithmic time"
+ Divide and conquer exponent
* f(n) is in O(n) "linear time"
+ (length lst)
+ Divide and conquer exponent (if both half-powers are recomputed)
+ Dave's exponent
What is the order of the "mathematical exponent"?
* It depends on the cost of e^x and ln(x)
When you find an upper bound, you want to find the smallest
upper bound. (For example, 3n is in O(n^2) as well as O(n),
but O(n) is the more informative bound.)