# Exam 4 – Advanced Scheming

Assigned
Friday, Dec 7, 2018
Due
Thursday, Dec 13, 2018 by 10:30pm
Collaboration
You may not receive help from any person other than the instructor for this exam. See the exam procedures for a detailed description of the resources you can and cannot use.
Submitting

Please read the exam procedures page for policies, turn-in procedures, and grading details. If you have any questions about the exam, check the Q&A at the bottom of this page. We will generally spend some time in class every day on questions and answers while the exam is in progress.

While the exam is out, please check back periodically to see if we have reported any new errata.

Complete the exam using the exam4.rkt starter source code.

## Problem 1: Create a tsil datatype

Topics: pairs and pair structures, lists, recursion

We’ve seen many types of data in Scheme, but two have had an interesting recursive structure: lists and binary trees. For example, consider the following definition of a list:

A list is:

• null, or
• a pair of some value and a list.

For this problem, you will create a new recursively-defined datatype known as a tsil. A tsil is:

• llun, or
• a pair of a tsil and some value.

The only hard requirement for creating a datatype in Scheme is to write a predicate that can check if a value conforms to the definition of the datatype. Write and document the procedure (tsil? val) that checks if a value is a tsil. The value llun is defined in your starter code, as is the predicate (llun? val).

Consider the following example uses of tsil?:

> (tsil? (list 1 2 3))
#f
> (tsil? (cons llun 2))
#t
> (tsil? (cons null 5))
#f
> (tsil? llun)
#t
> (tsil? (cons (cons (cons llun 3) 2) 1))
#t
> (tsil? "I am a tsil")
#f
> (tsil? 'no-really-I-am-a-tsil)
#f


## Problem 2: Working with tsil values

Topics: pairs and pair structures, lists, structural recursion

For the previous problem, you implemented a tsil? predicate to work with our new tsil datatype. While this is sufficient to consider our datatype complete, it would be more useful if we had some utilities to work with tsil values. Your task for this problem is to write, but not document, four tsil procedures:

• (tsil-ref tsl index), which takes a tsil and index, and returns the value at the given offset.
• (list->tsil lst), which takes a list and produces an equivalent tsil
• (tsil->list tsl), which takes a tsil and produces an equivalent list
• (tsil-length tsl), which takes a tsil and returns its length

We consider a list and tsil equivalent if they contain the same values at the same indices. For a concrete example, consider the list produced by (list 1 2 3). The equivalent tsil is (cons (cons (cons llun 3) 2) 1), as the 1 is held in the outermost pair, the 2 is in the next pair, and the 3 is in the innermost pair.

Consider the following example uses of these procedures as you complete your implementations:

> (define my-tsil (cons (cons (cons llun 3) 2) 1))
> my-tsil
'(((llun . 3) . 2) . 1)
> (tsil-length my-tsil)
3
> (tsil-ref my-tsil 0)
1
> (tsil-ref my-tsil 2)
3
> (tsil->list my-tsil)
'(1 2 3)
> (define another-tsil (list->tsil (list "this" "is" #\a 'tsil)))
> another-tsil
'((((llun . tsil) . #\a) . "is") . "this")
> (tsil-length another-tsil)
4
> (tsil->list another-tsil)
'("this" "is" #\a tsil)


## Problem 3: Whatzitdo?

Sometimes students (and professors) come up with difficult-to-read solutions to problems. Here is one such solution:

(define
enifed
(lambda
(rambda
schmambda)
(let
tel (
[rdc
(cdr
schmambda)]
[rac
(car
schmambda)])
(if
(null?
rdc)
rac
(tel
(cdr
rdc)
(rambda
rac
(car
rdc)))))))


Figure out what this procedure does and then do the following:

a. Rename the procedure and parameter(s) so that their type and purpose is clear.

b. Reformat the code.

c. Explain how this procedure achieves its goal.

d. Write 6P-style documentation for the code.

You do not need to preserve all the versions of this code; your exam file only has to contain 6Ps, the reformatted and renamed procedure, and your explanation for part c.

## Problem 4: Analyzing quicksort

Topics: sorting, lists, algorithm analysis

You’ve learned one sorting mechanism that applies the divide-and-conquer technique to achieve faster sorting: merge sort. But merge sort is not the only divide-and-conquer sorting algorithm. There’s a famous alternative called Quicksort. While Quicksort is normally done with vectors, we’ll consider a list-based version.

Merge sort divides the elements without considering the values of the elements. Can we perhaps gain power by looking at the values? Consider the problem of sorting a list of integers. If we knew the median value, we could break the list into three lists: the values less than the median, the values equal to the median, and the values greater than the median. Once we’ve done that, we can recursively sort the list of smaller values and the list of larger values and then append the three together.

Suppose we had the list '(5 2 9 1 3 8 0 4 2 7 5). The median value is 4. So we separate it into three lists.

• Values less than 4: '(2 1 3 0 2)
• Values equal to 4: '(4)
• Values greater than 4: '(5 9 8 7 5)

We sort the two lists. (How? Recursively!)

• Values less than 4, sorted: '(0 1 2 2 3)
• Values equal to 4: '(4)
• Values greater than 4: '(5 5 7 8 9)

And then we join everything together with append, giving us '(0 1 2 2 3 4 5 5 7 8 9).

How do we write this procedure? It’s fairly straightforward. First, we’ll write the procedure that separates the list into three parts.

;;; Procedure:
;;;   partition
;;; Parameters:
;;;   lst, a list of real numbers
;;;   num, a real number
;;; Purpose:
;;;   Separate lst into three lists: things less than num, things equal
;;;   to num, and things greater than num.
;;; Produces:
;;;   vol, a vector of lists of real numbers of the form
;;;     #(smaller equal greater).
;;; Preconditions:
;;; Postconditions:
;;;   * (append smaller equal greater) is a permutation of lst.
;;;   * All values in smaller are less than num.
;;;   * All values in equal are equal to num.
;;;   * All values in greater are larger than num.
(define partition
(lambda (lst num)
(let kernel ([remaining lst]
[smaller null]
[equal null]
[greater null])
(if (null? remaining)
(vector smaller equal greater)
(let [(val (car remaining))]
(kernel (cdr remaining)
(if (< val num) (cons val smaller) smaller)
(if (= val num) (cons val equal) equal)
(if (> val num) (cons val greater) greater)))))))


Next, we will write the primary Quicksort procedure.

;;; Procedure:
;;;   quicksort
;;; Parameters:
;;;   lst, a list of real numbers
;;; Purpose:
;;;   Sort the list using the legendary quicksort algorithm.
;;; Produces:
;;;   sorted, a list
;;; Preconditions:
;;; Postconditions:
;;;   * sorted is a permutation of lst.  No elements are added or removed.
;;;   * sorted is sorted.  For all reasonable i, (list-ref sorted i) <=
;;;     (list-ref sorted (+ i 1)).
(define quicksort
(lambda (lst)
(if (or (null? lst) (null? (cdr lst)))
lst
(let [(parts (partition lst (median lst)))]
(append (quicksort (vector-ref parts 0))
(append (vector-ref parts 1)
(quicksort (vector-ref parts 2))))))))


How do we find the median value of a list of real numbers? It turns out that that’s a hard problem. The normal approach is “sort the list and look in the middle”. But that doesn’t work if we’re using the median to sort. The typical implementation of Quicksort makes an interesting choice: It selects a random element of the list and uses that as a median.

(define random-element
(lambda (lst)
(list-ref lst (random (length lst)))))


Is this a good choice? Let’s do some experimental analysis to find out.

a. Update the quicksort and partition procedures above so they counts calls to car, cdr, null? and cons. To do so, you will likely need to find or write versions of list-ref, length, and append that you can annotate to count calls.

b. Run a series of experiments on lists of length 100, 200, 400, and 800 to determine whether the Quicksort algorithm appears to take approximately (* c n (log_2 n)) steps for some constant c.

You should do experiments with lists that are already sorted (e.g., using iota to generate the lists). You should do experiments with lists whose elements are randomly generated. Make sure to do at least five experiments for each size/type and to include evidence of your experiments in the exam.

                        sorted, average                 random, average
n     nlogn     n*n     car   cdr   cons  null?         car   cdr   cons  null?
---   -----   -----     ---   ---   ----  -----         ---   ---   ----  -----
100     664   10000
200    1529   40000
400    3458  160000
800    7715  640000


c. What do your experiments tell you about the running time of the list based Quicksort? Does it seem like nlogn, n*n, or something else?

## Problem 5: Decision Trees

Topics: trees, predicate procedures, recursion

We have used trees to store data, either in a free-form way, or with constraints on ordering as in binary search trees. However, trees can do much more than just store data. Decision trees are one example use of trees for something other than data storage.

A decision tree is a tree that contains a series of questions (predicates, in Scheme lingo). Given an input value, we can execute the decision tree by beginning at the root of the tree. The contents of the root node will be a predicate, which we run on our input value. If the result of the predicate is #t, we proceed down the right subtree. If the result is #f, proceed down the left subtree. Once we reach a leaf node, we return the contents of the node as our final answer.

For example, consider the decision tree (node odd? (leaf "even number") (node positive? (leaf "negative odd number") (leaf "positive odd number"))), which we will run on the input 7. At the root of this tree, we see the odd? predicate. We can run this predicate on our input, 7, which returns #t. We then proceed to the right subtree, which contains the predicate positive?. We evaluate (positive? 7), which also returns #t. We then move to the right subtree, where we encounter a leaf. Instead of running a predicate, when we encounter a leaf we simply return the contents of the leaf node, in this case a "positive odd number". Had we evaluated the tree with a negative odd number, we would instead produce "negative odd number", and if we had run the tree on an even number we would receive "even number".

Write, but do not document, a procedure (run-decision-tree decision-tree input) that follows this process to run a decision tree on a given input.

The following examples illustrate a more interesting application of decision trees. We will first define a helper procedure:

; Look in a list of past courses to see if a student has taken a particular course
(define has-taken
(lambda (course past-courses)
(not (= -1 (index-of course past-courses)))))


Next, we define a decision tree. This decision tree takes as input a list of courses a student has taken in the CS introductory sequence, and returns a list of courses that student is now eligible to take, having satisfied the prerequisites:

(define cs-intro-sequence
(node (section has-taken "CSC 151" <>)
(leaf (list "CSC 151"))
(node (section has-taken "CSC 161" <>)
(leaf (list "CSC 161"))
(node (section has-taken "CSC 207" <>)
(leaf (list "CSC 207"))
(leaf "All done!")))))


Consider the following example uses of this decision tree:

> (run-decision-tree cs-intro-sequence (list "CSC 151"))
'("CSC 161")
> (run-decision-tree cs-intro-sequence (list "CSC 151" "CSC 161"))
'("CSC 207")


You can find a much larger decision tree that covers the entire core of the computer science curriculum at Grinnell in the exam starter code. This decision tree does not check for electives, but does include all courses that count as required courses toward the major. Here are some sample uses of that decision tree:

> (run-decision-tree cs-curriculum (list "CSC 151" "CSC 161"))
'("CSC 207" "CSC 208" "CSC 211" "CSC 213")
> (run-decision-tree cs-curriculum (list "CSC 151" "CSC 161" "CSC 208"))
'("CSC 207" "CSC 211" "CSC 213")
> (run-decision-tree cs-curriculum (list "CSC 151" "CSC 161" "CSC 207" "CSC 208"))
'("CSC 211" "CSC 213" "CSC 301" "CSC 324" "CSC 341")
> (run-decision-tree cs-curriculum (list "CSC 151" "CSC 161" "CSC 207" "CSC 208" "CSC 211" "CSC 213" "CSC 301" "CSC 324" "CSC 341"))
"All done! Just make sure you've completed your elective."


Your run-decision-tree implementation must be able to run both the cs-intro-sequence and cs-curriculum decision trees, but your three examples must use a decision tree of your own design.

We will post answers to questions of general interest here while the exam is in progress. Please check here before emailing questions!

### Problem 1

I see that a symbol is not a tsil, but 'llun both a tsil and a symbol. What gives?
Recall the definition of a tsil: a tsil is either the value llun (which is defined as a symbol in the starter code), or a pair of a tsil and some value.
Why isn’t (cons 3 llun) a tsil?
The definition for a tsil says that a tsil is either llun or a pair of a tsil and some value. The order of values in the pair is not arbitrary; the same is true for lists, but tsils expect the recursive value in the car rather than the cdr position of the pair.
Is (list llun 1 2 3) a tsil?
Remember that lists are pair structures. We could rewrite that expression as (cons llun (list 1 2 3)). This is a pair of a llun and some value, so it is indeed a tsil. However, it’s a tsil of length 1, which may not be what you expected.
Can we assume that you know what a tsil is when we document tsil?.
Yes. You can reference the definition of a tsil.
My implementation of tsil? crashes on some of your inputs. That makes me sad.
Your implementation should always work, even if I give it a non-tsil. You’re probably calling car on a non-pair. Make sure you check for pairs!

### Problem 2

Can we use some of our procedures to implement others?
Maybe. You cannot use tsil->list to implement tsil-length or tsil-ref.

### Problem 4

Which procedures are we supposed to modify for counting?
You should update quicksort, partition, and random-element so all their calls to car, cdr, etc. are counted. You do not need to count calls to car, cdr, etc. in the random-list procedure. The idea here is that you are counting calls that happen during the execution of quicksort, but not in generating the test input you run quicksort on.
What is the constant c for?
It is a coefficient for the nlogn term in the number of steps that quicksort takes. You can choose some integer value that fits the data.
Do we need to document our counted versions of car, cdr, length, etc?
No!
For part b, should we guess what c is for a sorted list or a random list?
Use the random list as your input.

### General Questions

Do we need to document locally-defined procedures?
No.
Are we allowed to talk to others about general issues from the first few weeks of class, such as the syntax of section?
No. We’ve had too many problems with “slippery slope” issues.
Questions should go to the instructors.
What is a general question?
A question that is about the exam in general, not a particular problem.
Do all the sections have the same exam?
Yes.
Can we still invoke the “There’s more to life” clause if we spend more than five hours on the exam?
Yes. However, we really do recommend that you stop at five hours unless you are very close to finishing. It’s not worth your time or stress to spend more effort on the exam. It is, however, worth your time to come talk to us, and perhaps to get a mentor or more help (not on this exam, but on the class). There’s likely some concept you’re missing, and we can help figure that out.
What do you mean by “implement?”
Write a procedure or procedures that accomplish the given task.
Do we have to make our code concise?
You should strive for readable and correct code. If you can make it concise, that’s a plus, but concision is secondary to readability and correctness. Long or muddled code is likely to lose points, even if it is correct.
Much of your sample 6P-style documentation has incomplete sentences. Can we follow that model? That is, can we use incomplete sentences in our 6P-style documentation?
Yes, you can use incomplete sentences in 6P-style documentation.
You tell us to start the exam early, but then you add corrections and questions and answers. Isn’t that contradictory? Aren’t we better off waiting until you’ve answered the questions and corrected any errors?
We think you’re better able to get your questions answered early if you start early. Later questions will generally receive a response of “See the notes on the exam.”
How do we know what our random number is?
You should have received instructions on how to generate your random number on the day the exam was distributed. If you don’t have a number, ask your professor for one before submitting your exam.
To show we’ve tested the code informally, would you just like us to just post the inputs we used to test the procedure? If so, how should we list those?
Copy and paste the interactions pane into the appropriate place in the definitions pane. Put a #| before the pasted material. Put a |# after the pasted material.
Should we cite our partner from a past lab or assignment if we use code from a past lab or assignment?
You should cite both yourself and your partner, although you should do so as anonymously as possible. For example “Ideas taken from the solution to problem 7 on assignment 2 written by student 641321 and partner.”
If we write a broken procedure and replace it later, should we keep the previous one?
Yes! This will help us give you partial credit if your final procedure isn’t quite right. But make sure to comment out the old one using #| and |# or semicolons. You might also add a note or two as to what you were trying to do.
Do I need to document helper procedures?
You must document helpers using the 4Ps.
Can I use procedures that we have not covered in class?
Can I use list-ref? I’d swear we’ve used it, but I can’t find where.
You may certainly use list-ref.
I discovered a way to define procedures without using lambda that looks like (define (proc params) body). Can I use that?
No.
Is extra credit available on this exam?
No explicit extra credit is available. However, we may find that you’ve come up with such wonderful answers that we cannot help but give you extra credit. (Don’t laugh. It happens almost every time.)
DrRacket crashed and ate my exam. What do I do?
Send your instructor the exam file and any backups that DrRacket seems to have made. (Those tend to have a similar file name, but with pound signs or tildes added at the front or back.)
How do I know when you’ve added a new question or answer?
We try to add a date stamp to the questions and answers.
Should we cite readings from the CSC 151 Web site? If so, how much information should we give?
Yes. The title and URL should suffice.
Can I use other websites as a reference for the exam?
As the exam policies page states, “This examination is open book, open notes, open mind, open computer, open Web.” Note that it also states “ If you find ideas in a book or on the Web, be sure to cite them appropriately.”
Do we have to cite the exam itself?
No.
How do I generate a random six digit number?
(random 1000000). If you end up with a number less than six digits, add zeros to the left side of the number until you have six digits.
Some of the problems are based on a recent homework assignment. Should I cite my work on that assignment?
If you refer to your work on that assignment, you should cite it.
How do I cite my work on an assignment?
Something like “Student 123456 and partner. Homework Assignment 3. CSC 151.01. 6 September 2018.”
I see the comment ; STUB on some of the procedures. What does that mean?
We use the term “STUB” for a procedure that gives some default value rather than the correct one. Stub procedures are useful when you need the procedure to exist during development, but have not yet had time to implement it. You should remove these comments as you replace stub procedures with real implementations.
How exhaustive do our postconditions need to be when writing 6P-style documentation?
Your postconditions are not a test suite, but they should capture all of the procedure’s behavior. You do not need to give specific input/output examples, but describe what is true about the output as much as you can in a few concise lines.
Are there any requirements for informal tests?
Sort of. They should be good tests, though they need not be exhaustive. Your informal tests only count if they are different from the examples on the exam.
Do we need to show examples of running our helper procedures?
No, but you should be checking them carefully. You don’t need to include these examples in your exam.

## Errata

Please check back periodically in case we find any new issues.

### Problem 1

• There was a missing ) in the first example. [No EC. Fixed before end of first day.]
• The word “procedure” was misspelled “procesure”. [+1 point. JL]
• The word “definition” was misspelled “defintion”. [+1 point. TK]

### Problem 2

• The documented name for tsil-length was tsil->length. [+1/2 point. SB]

### Problem 3

• The topics included “sorting”, but this problem is not related to sorting. [+1/2 point. AA]

### Problem 5

• The problem used the singular “course” when it should have used “courses”. [+1 point. CN]
• The has-taken predicate should have had a question mark. I am not updating the problem because that would break your example code if you already downloaded the starter file. Because this is a style rather than correctness issue, there will be no bonus point for this error.

## Citations

Some of the problems on this exam are based on—and at times copied from—problems on previous exams for the course. Those exams were written by Charlie Curtsinger, Janet Davis, Rhys Price Jones, Titus Klinge, Samuel A. Rebelsky, John David Stone, Henry Walker, and Jerod Weinman. Many were written collaboratively, or were themselves based upon prior examinations, so precise credit is difficult, if not impossible.

Some problems on this exam were inspired by conversations with our students and by correct and incorrect student solutions on a variety of problems. We thank our students for that inspiration. Usually, a combination of questions or discussions inspired a problem, so it is difficult and inappropriate to credit individual students.