Recursion with Helper Procedures
Summary:
You've now seen a bit of the basics of recursion as a program design
strategy. We now consider an alternate technique for writing recursive
procedures, one in which we build helpers that carry along a partial
result at each step. We also see how such helpers can be used to
create tail recursive solutions, which are often
more efficient than more basic recursive procedures.
Introduction
There are a few basic principles to writing recursive functions. As you
may recall from the
first reading on recursion, a recursive procedure is one that
calls itself. To successfully write a recursive procedure, you need
a base case, in which you compute the result
without a
recursive call;
a recursive case, in which you compute a partial
result recursively, and then expand it into the complete result; and
a test, in which you decide whether to use the
base case or the recursive case.
A disadvantage of traditional recursion is that it is difficult to
tell what is happening along the way. In the particular case of list
recursion, we typically work all the way down to the end of the list,
and then work our way back to the front of the list to compute a
final result. When they encounter this issue, many students ask,
Why not just compute the result as we go? We
consider that suggestion in this reading.
Review: Scheme's Evaluation Strategy
Before we go much further, let's take a quick detour into an
important concept that we've discussed before: When Scheme has a
nested expression to evaluate, how does it evaluate that expression?
The technique it uses is pretty straightforward: In order to apply
a procedure to parameters, those parameters must be in simple form
(that is, values, rather than expressions to compute values). Hence,
given (proc exp1
exp2 exp3),
Scheme first evaluates exp1,
exp2, and exp3, looks
up proc in the environment, and then applies the
procedure it finds to the three resulting values.
In what order does it evaluate exp1,
exp2, and exp3? It turns
out that it depends on the implementation of Scheme. In our sample
code, we'll always assume that the Scheme interpreter evaluates the
parameters from left to right.
Of course, for some operations (which we call keywords
or syntax
), the Scheme interpreter uses a
different evaluation strategy. For example, when evaluating an
if expression, the interpreter does not evaluate
all three parameters. Instead, it evaluates them in a specified order
(first the test, then only one of the consequent and alternate).
Similarly, when evaluating an and or an
or, the interpreter evaluates the parameters
one at a time, stopping as soon as it knows an answer (in the case of
and, when it hits a #f; in the case
of or, when it hits a #t).
But there's more. To understand how Scheme evaluates expressions, we
must also understand how it applies the procedures you define and how
it processes definitions in general.
The naming scheme that Scheme uses is relatively straightforward.
Scheme keeps a table of correspondences between names and values.
For each name, it stores a reference to the value,
which means that two names can refer to the same value. (This is
important to know, because if you change a value with a side-effecting
function, then all references to that value seem to change.) If you
write another definition using the same name, it changes the reference,
but not the underlying value. When Scheme sees a name while evaluating
an expression, it looks up the name in the table, and uses the value
the name refers to.
The way procedures work takes advantage of this naming scheme.
When you call a procedure on some arguments, the Scheme interpreter
first updates the name table, mapping each name in the lambda to the
corresponding argument. For example, if you apply (lambda (a b
c) ...) to 3, 1, and 6, the table will now have a
refer to 3, b refer to 1, and c refer to 6.
After updating the table, the interpreter evaluates the body of the
procedure. When it finishes evaluating the body, it cleans up the
name table, removing any new entries it had added.
Why is the way Scheme evaluates and understands expressions important
for recursion? First, it may help you understand how recursion works.
Because if (or cond), delays the evaluation of
the consequent and alternate, we can safely have a procedure call itself
without worrying about it calling itself again and again and again,
ad infinitum. Second, it clarifies how we can have the same names (the
parameters) mean different things at different times.
Recursive Sum, Re-Examined
In case you've forgotten the example from the prior reading, let's look
at how the first version of sum computes the sum
of the list with elements 38, 12, 93. (To keep the example concise,
we leave out the computations involving car and
cdr.)
(sum (cons 38 (cons 12 (cons 83 null))))
=> (if (null? (cons 38 ...)) 0 (+ 38 (sum (cons 12 (cons 83 null)))))
=> (+ 38 (sum (cons 12 (cons 83 null)))))
=> (+ 38 (if (null? (cons 12 ..)) 0 (+ 12 (sum (cons 83 null)))))
=> (+ 38 (+ 12 (sum (cons 83 null))))
=> (+ 38 (+ 12 (if (null? (cons 83 ...)) 0 (+ 83 (sum null)))))
=> (+ 38 (+ 12 (+ 83 (sum null))))
=> (+ 38 (+ 12 (+ 83 (if (null? null) 0 (+ (car null) (sum (cdr null)))))))
=> (+ 38 (+ 12 (+ 83 0)))
=> (+ 38 (+ 12 83))
=> (+ 38 95)
=> 133
If we pull out the tests (which are fairly straightforward), we can
think of the computation as follows.
(sum (cons 38 (cons 12 (cons 83 null))))
=> (+ 38 (sum (cons 12 (cons 83 null)))))
=> (+ 38 (+ 12 (sum (cons 83 null))))
=> (+ 38 (+ 12 (+ 83 (sum null))))
=> (+ 38 (+ 12 (+ 83 0)))
=> (+ 38 (+ 12 83))
=> (+ 38 95)
=> 133
An Alternative Sum
Now, let's return to our goal: Looking at other ways to do recursion.
We'll consider the question of computing results as we go with the
summation example from the previous reading. As you may have noted, an
odd thing about the sum procedure is that it works
from right to left. Traditionally, we sum from left to right. Can we
rewrite sum to work from left to right? Certainly, but
we may need a helper procedure (another procedure
whose primary purpose is to assist our current procedure) to do so.
If you think about it, when you're summing a list of numbers from left
to right, you need to keep track of two different things:
The sum of the values seen so far.
The remaining values to sum.
Hence, we'll build our helper procedure with two parameters,
sum-so-far and remaining.
We'll start the body with a template for recursive procedures (a test
to determine whether to use the base case or recursive case, the base
case, and the recursive case). We'll then fill in each part.
(define new-sum-helper
(lambda (sum-so-far remaining)
(if (test)
base-case
recursive-case)))
The recursive case is fairly easy. Recall that since
remaining is a list, we can split it
into two parts, the first element (that is, the car), and the
remaining elements (that is, the cdr). Each part contributes
to one of the parameters of the recursive call. We update
sum-so-far by adding the first element of
remaining to sum-so-far.
We update remaining by removing the
first element. To continue
, we simply call
new-sum-helper again with those updated parameters.
(define new-sum-helper
(lambda (sum-so-far remaining)
(if (test)
base-case
(new-sum-helper (+ sum-so-far (car remaining))
(cdr remaining)))))
The recursive case then gives us a clue as to what to use for the test.
We need to stop when there are no elements left in the list. (Otherwise,
taking the car and the cdr would create errors.)
(define new-sum-helper
(lambda (sum-so-far remaining)
(if (null? remaining)
base-case
(new-sum-helper (+ sum-so-far (car remaining))
(cdr remaining)))))
We're almost done. What should the base case be? In the previous version,
it was 0. However, in this case, we've been keeping a running sum. When
we run out of things to add, the value of the complete sum is the value
of the running sum.
Now we're ready to write the primary procedure whose responsibility
it is to call new-sum-helper. Like
sum, new-sum will take a list
as a parameter. That list will become remaining.
What value should sum-so-far begin with?
Since we have not yet added anything when we start, it begins at 0.
Putting it all together, we get the following.
Note that it is a bit difficult to write the documentation for the
helper procedure, since we have to think about the role of the
sum-so-far parameter in the context of the
overall goal. At the same time, doing so helps us clarify exactly
what we expect the procedure to do. It also helps us verify that the
base case is appropriate. In this case, since we're supposed to add
the intermediate sum to the sum of the remaining values, when there are
no remaining values, we can just add 0 and stop.
Watching New Sum
Does this change make a difference in the way in which the sum is
evaluated? Let's watch.
(new-sum (cons 38 (cons 12 (cons 83 null))))
=> (new-sum-helper 0 (cons 38 (cons 12 (cons 83 null))))
=> (if (null? (cons 38 ...)) 0 (new-sum-helper (+ 0 38) (cdr (cons 38 ...))))
=> (new-sum-helper (+ 0 38) (cons 12 (cons 83 null)))
=> (new-sum-helper 38 (cons 12 (cons 83 null)))
=> (if (null? (cons 12 ...)) 38 (new-sum-helper (+ 38 12) (cdr (cons 12 ...))))
=> (new-sum-helper (+ 38 12) (cons 83 null))
=> (new-sum-helper 50 (cons 83 null))
=> (if (null? (cons 83 null)) 50 (new-sum-helper (+ 50 83) (cdr (cons 83 null))))
=> (new-sum-helper (+ 50 83) null)
=> (new-sum-helper 133 null)
=> (if (null? null) 133 (new-sum-helper (+133 (car null)) (cdr null)))
=> 133
If we don't show the conditional steps, we get the following as the
basic structure of the recursion.
(new-sum (cons 38 (cons 12 (cons 83 null))))
=> (new-sum-helper 0 (cons 38 (cons 12 (cons 83 null))))
=> (new-sum-helper (+ 0 38) (cons 12 (cons 83 null)))
=> (new-sum-helper 38 (cons 12 (cons 83 null)))
=> (new-sum-helper (+ 38 12) (cons 83 null))
=> (new-sum-helper 50 (cons 83 null))
=> (new-sum-helper (+ 50 83) null)
=> (new-sum-helper 133 null)
=> 133
Note that the intermediate results for new-sum
were different, primarily because new-sum operates from
left to right rather than right to left.
The Technique, Generalized
Okay, how does one write one of these procedures that accumulates a
result as you go? First, you build a
helper procedure that takes as parameters (1) the list you're recursing
over, which we've called remaining, (2) a new
parameter, ____-so-far, and (3) any other useful
parameters. The helper recurses over the list, updating that list to
its cdr at each step and updates ____-so-far
to be the next partial result. At the end, the procedure typically returns
____-so-far or something computed from that
value.
After creating the helper, you build the main procedure, which calls the
helper with an appropriate initial ____-so-far
and remaining. For example,
(define proc
(lambda (lst)
(proc-helper initial-value lst)))
(define proc-helper
(lambda (so-far remaining)
(if (null? remaining)
so-far
(proc-helper (combine so-far (car remaining))
(cdr remaining)))))
However, as we'll see in the next section, you sometimes need to
use different initial values for so-far and
remaining.
Another Application: Difference
We've now seen two strategies for doing recursion: We can combine the result
of a recursive call into a final result, or we can pass along intermediate
results as we recurse, and stop with the final result. Which should you
use? For many applications, it doesn't matter. However, there are times
that one technique is much more appropriate than the other. Consider a
procedure that takes a list of numbers,
n1,
n2,
n3, ...
nk-1,
nk,
and computes
n1 -
n2 -
n3 - ... -
nk-1 -
nk.
Suppose we use the first technique. We might write:
Unfortunately, this procedure doesn't quite work as you might expect.
Consider the computation of (difference (list 1 2 3)).
As you may recall, subtraction is left associative, so this should be
(1-2)-3, or -4. Let's see what happens.
(difference (cons 1 (cons 2 (cons 3 null))))
=> (- 1 (difference (cons 2 (cons 3 null))))
=> (- 1 (- 2 (difference (cons 3 null))))
=> (- 1 (- 2 (- 3 (difference null))))
=> (- 1 (- 2 (- 3 0)))
=> (- 1 (- 2 3))
=> (- 1 -1)
=> 2
Hmmm ... that's not quite right. So, let's try the new technique of using
a helper.
Okay, what happens here?
(new-difference (list 1 2 3))
=> (new-difference-helper 0 (cons 1 (cons 2 (cons 3 null))))
=> (new-difference-helper (- 0 1) (cons 2 (cons 3 null)))
=> (new-difference-helper -1 (cons 2 (cons 3 null)))
=> (new-difference-helper (- -1 2) (cons 3 null))
=> (new-difference-helper -3 (cons 3 null))
=> (new-difference-helper (- -3 3) null)
=> (new-difference-helper -6 null)
=> -6
Nope. Still incorrect. So, what happened? We started with 0,
then we subtracted 1, then we subtracted 2, then we subtracted 3.
It looks like we did those last two subtractions in order. But wait!
Why are we subtracting 1? That's the value we're supposed to
subtract everything else from. That means we need to choose a better
initial difference-so-far and remaining.
Why not make the car the difference-so-far and the cdr
remaining?
Does this work?
(newer-difference (list 1 2 3))
=> (new-difference-helper 1 (cons 2 (cons 3 null)))
=> (new-difference-helper (- 1 2) (cons 3 null))
=> (new-difference-helper -1 (cons 3 null))
=> (new-difference-helper (- -1 3) null)
=> (new-difference-helper -4 null)
=> -4
It works! Well, at least it works for this example. In the lab,
we'll explore whether it works for other examples, too.
Watching Recursive Helpers
Sometimes, students ask Can I watch the recursion?
Unfortunately, it is difficult to make Scheme show you all the steps
in a typical recursive procedure, there is no way to access the
delayed actions. But helper recursion makes life a bit easier, since
we typically don't have delayed actions. If you want to watch what
is happening in a helper recursive procedure, you can add commands
to print things out (immediately after the lambda).
For example, here is an annotated version of the last version of the
difference procedure.
Here's some sample output.
> (annotated-difference (list 1 2 3 4 5))
-13
(difference-helper 1 (2 3 4 5))
(difference-helper -1 (3 4 5))
(difference-helper -4 (4 5))
(difference-helper -8 (5))
(difference-helper -13 ())
Terminology: Tail Recursion
You have probably noticed an important difference between the first form
of recursion we used and the helper version: In the original form, once
we reached the base case, we had to back up and do more computation.
In this new form, it seems that once we reach the base case, we're done.
Behind the scenes for the original form, the Scheme interpreter has
to spend extra effort to remember the extra work to be done after the
base case is reached. In this new form, it does not.
When the computation in a recursive procedure is completed when
it reaches the base case, the procedure is called tail
recursive. A tail-recursive version of a procedure is often
(but not always) more efficient than the non-tail-recursive version.
Hence, many Scheme programmers regularly convert their procedures to
tail-recursive form. How do you convert to this form? Most frequently,
you use the techniques in this reading: You create a recursive helper
procedure that accumulates the value to be computed.
Recursion and Efficiency
When we write recursive procedures carelessly, we can create very
inefficient computation. Let us consider, for a moment, the problem
of computing the brightest color in a list of colors.
Before writing any code, we should, of course write the introductory
documentation.
Now we are ready to write some code. Let us first consider a
basic recursive solution. We start with the template of a
recursive procedure.
(define rgb-brightest
(lambda (colors)
(if (test)
base-case
recursive-case)))
The test and base case are straightforward: If there is only one value left
in the list, it is clearly the brightest color.
(define rgb-brightest
(lambda (colors)
(if (null? (cdr colors))
(car colors)
recursive-case)))
But what about the recursive case? Suppose we compute the brightest
color in the remainder of the list. What does that tell us about the
brightest color in the list? Well, the brightest color is either the
brightest color in the remainder or the first color, whichever is
brighter. We can express that choice as a second conditional.
(if (>= (rgb-brightness (car colors))
(rgb-brightness (rgb-brightest (cdr colors))))
(car colors)
(rgb-brightest (cdr colors)))
When we plug that into the rest of the procedure, we get
Most Scheme programmers, when confronted with these nested if statements,
will turn them into a cond statement.
Both of these solutions are correct. However, both are quite inefficient.
What makes them inefficient? The two recursive calls. Normally,
you wouldn't think that two calls would make a huge difference (well,
maybe a factor of two). However, for recursive procedures, the expense
of the extra call grows exponentially.
Consider the particular case of
of a list of five elements in which the last element is the brightest.
In the test, we recurse on the list of four elements in the test. When that
recursive call finishes, we recurse again on the same list in order to
compute the alternate. Now, let us consider one of those identical
recursive calls. Once again, the test requires us to recurse on the
cdr (the list of three elements). And, once again, we need to recurse
on that same list hen we compute the alternate. Since we had to find
the brightest element of the list of four elements twice, we need to find the
brightest element of the list of three elements four times.
Now, each of those computations (that is, finding the brightest in a
list of three elements), needs to recurse on a list of two elements
twice, once for the test and once for the alternate. Given that we
found the brightest element of the same list of three elements four times,
we find the brightest element of the same list of two elements eight times.
We're almost done with the analysis. For each of those identical
computations of the brightest element in a list of two elements, we
find the brightest element in the singleton list twice (once for the
test, once for the alternate). Since there are eight calls
involving the list of two elements, there are sixteen calls
involving the list of one element.
How calls to rgb-brightest did we end up with?
Let's see ... one for the list of five elements, two for the sublist of
four elements, four for the sublist of three elements, eight for the
sublist of two elements, and sixteen for the sublist of one element.
1 + 2 + 4 + 8 + 16 is 31. That's a lot of calls for a list of five
colors. If we had a list of six colors, it could take 63 calls. If
we had a list of ten colors, it could take over one thousand calls.
You would hope that we could come up with a solution that does not
require many more recursive calls than the elements of the list. And
such a solution is possible with a helper. This time, we'll keep track
of the brightest color we've seen so far. We'll model the solution on
difference, which had the following form:
(define proc
(lambda (lst)
(proc-helper (car lst) (cdr lst))))
(define proc-helper
(lambda (so-far remaining)
(if (null? remaining)
so-far
(proc-helper (combine so-far (car remaining))
(cdr remaining)))))
We'll start by plugging in appropriate names.
(define rgb-brightest
(lambda (colors)
(rgb-brightest-helper (car colors) (cdr colors))))
(define rgb-brightest-helper
(lambda (brightest-so-far colors-remaining)
(if (null? colors-remaining)
brightest-so-far
(rgb-brightest-helper
(combine brightest-so-far (car remaining-colors))
(cdr remaining-colors)))))
Now, how do we compute a new brightest color so far, given a previous
brightest color so far and one additional color? If the previous color
is brighter, we keep it. If the new color is brighter, we use it.
This solution is both correct and efficient, but some folks have trouble
reading that conditional in the middle. We can clarify the code by
writing a procedure, (rgb-brighter
color1 color2),
that finds the brighter of two colors. Once we've written
rgb-brighter, we can rewrite the helper as
Now, let's consider what happens when we try to find the brightest color
in the list of black, blue, red, green, and white. (We'll use the names
in the example. Scheme would use their numeric values.)
(rgb-brightest (list black blue red green white))
=> (rgb-brightest-helper black (list blue red green white))
=> (rgb-brightest-helper (rgb-brighter black blue) (list red green white))
=> (rgb-brightest-helper blue (list red green white))
=> (rgb-brightest-helper (rgb-brighter blue red) (list green white))
=> (rgb-brightest-helper red (list green white))
=> (rgb-brightest-helper (rgb-brighter red green) (list white))
=> (rgb-brightest-helper green (list white))
=> (rgb-brightest-helper (rgb-brighter green white) (list))
=> (rgb-brightest-helper white (list))
=> white
Not bad. About five recursive calls (plus four calls to
rgb-brighter). And, the process looks clear
enough that we can be fairly confident that with a list of size
ten, this will take only ten recursive calls (plus nine calls to
rgb-brighter).
Of course, for this new version to work, we need to have
rgb-brighter defined. But that is straightforward.
(You may even have it in your library already.)
Of course, once we've defined rgb-brighter, we
might want to revisit the design of our first recursive version.
You may recall that we'd gotten to this following stage in our
design:
(define rgb-brightest
(lambda (colors)
(if (null? (cdr colors))
(car colors)
recursive-case)))
Now, what goes in the recursive case? We said
Suppose we compute the brightest
color in the remainder of the list ... the brightest color is either the
brightest color in the remainder or the first color, whichever is
brighter.
We then expressed that decision as a conditional.
Now we can express it using rgb-brighter.
Now, let's trace what this procedure does on the same list.
(rgb-brightest (list black blue red green white))
=> (rgb-brighter black (rgb-brightest (list blue red green white)))
=> (rgb-brighter black (rgb-brighter blue (rgb-brightest (list red green white))))
=> (rgb-brighter black (rgb-brighter blue (rgb-brighter red (rgb-brightest (list green white)))))
=> (rgb-brighter black (rgb-brighter blue (rgb-brighter red (rgb-brighter green (rgb-brightest (list white))))))
=> (rgb-brighter black (rgb-brighter blue (rgb-brighter red (rgb-brighter green white))))
=> (rgb-brighter black (rgb-brighter blue (rgb-brighter red white)))
=> (rgb-brighter black (rgb-brighter blue white))
=> (rgb-brighter black white)
=> white
In terms of computation, this seems about the same as the second
helper version: We did five recursive calls and four calls to
rgb-brighter. However, we see that the calls
to rgb-brighter get stacked up, which makes it a
bit more difficult for the interpreter.
What should you take away from this discussion? A number of things.
First, if a recursive procedure has more than one recursive call per
datum, it is likely to be absurdly slow. (Later in the semester,
we'll see some times in which multiple recursive calls are okay.
However, such situations are rare.) Second, as you've already learned,
there is often more than one way to solve the same problem. We've seen
five or so different solutions to the same problem. Third, once you've
written a procedure, it can be useful to step back and think about how
you can make it clearer. Finally, standard recursion tends to create
a backlog of delayed actions, which are finally performed when the base
case is reached. In contrast, tail recursion creates no such backlog.