How Scheme Evaluates Expressions (version 1)
Summary: We consider the algorithm and structures
that the Scheme interpreter users to evaluate Scheme expressions.
Introduction
To many Scheme programmers, the Scheme environment is a bit of a
black box
: You type an expression, the Scheme interpreter
chugs along for a few milliseconds (or perhaps longer) and then spits
out an answer. It is certainly possible to use Scheme without knowing
anything more about the Scheme interpreter. However, we have found that
students use Scheme much better when they know a little bit about the
way in which Scheme interprets expressions. As importantly, when things
go wrong, knowing a bit about the interpreter can help you resolve
problems.
Clearly, it's not appropriate to cover every aspect of the Scheme
interpreter this early in your experience of Scheme. Hence, we
will start with a simple description of the Scheme interpreter and,
when you learn new parts of Scheme, we'll add more information about
the interpreter.
Right now, you know how to enter four kinds of expressions in Scheme:
Simple values: The building blocks of more
complex expressions. Things like 2.4 and
"hello".
Procedure calls: Computations that
involves applying some procedure (such as addition) to some values (such as
1, 2, and 3).
Definitions: The association of a name, such
as homework1, with a value, such as 91,
using the define keyword.
Names of values: A name given by a prior
definition, like the homework1 just described,
or built-in to Scheme, such as pi.
We will consider how Scheme evaluates each in turn.
Structures
When the Scheme interpreter evaluates expressions, it relies on
a data structure to keep track of information. In particular,
a dictionary associates names with values.
The Scheme interpreter uses the dictionary to look up each name that it
encounters.
The Scheme interpreter updates the dictionary when it encounters a
define expression.
Evaluating the Four Kinds of Expressions
Evaluating Simple Values
The evaluation strategy for simple expressions is fairly straightforward:
The Scheme interpreter simply converts the human-readable expression to
some internal representation. If the simple expression is all that's
been typed, it then converts that internal representation back to
human-readable form and prints it out.
How does the Scheme interpreter know that it has a simple value to
evaluate? If the interpreter sees a digit (0-9) or a minus-sign, it
knows that the value is a number, and reads the remaining parts of the
number. If the interpreter sees a quotation mark ("), it knows that it
has a string and reads everything else until the ending quotation mark.
Since those are the only kinds of simple values we know about right now,
those are the only ones we discuss here.
Evaluating Procedure Calls
Evaluating procedure calls is clearly a bit more difficult.
In general, the Scheme interpreter can't apply a procedure to some
arguments until it has evaluated the arguments. Hence, it evaluates
all the arguments (using the same algorithm) and then applies the
procedure.
For example, to evaluate (+ (* 2 3) (* 4 5)), the interpreter
must first evaluate (* 2 3) and (* 4 5). The
results of those evaluations are 6 and 20, which it can then add, giving
26. If the addition was the outermost operation, it can then print the
result. If the addition expression is part of a larger expression, such as
(* 0.25 (+ (* 2 3) (* 4 5))), it uses the
26 in evaluating the enclosing expression.
How does the interpreter know that it has a compound expression to evaluate?
It sees an open parenthesis. Unless the first thing after the open
parenthesis is the keyword define (or one of a few
other special keywords
you'll learn about later), it assumes that the first thing after the
open parenthesis is a procedure, the remaining things are the arguments
to that procedure.
Evaluating Definitions
Definitions are a special case of compound expressions in which
the procedure is define. To evaluate a definition,
the interpreter must evaluate the associated expression. Once it is
done evaluating that expression, it puts the name and the value of
the expression into the dictionary. For example, given
(define x (+ 2 3)), the interpreter evaluates
(+ 2 3), giving 5, and then puts the entry
[x : 5] in the dictionary.
Evaluating Names
As the definition of the dictionary suggests, the evaluation strategy
for names is pretty straightforward: The Scheme interpreter simply looks
them up in the dictionary.
Some Examples
Let's consider a few examples. Suppose we start with the empty dictionary
and evaluate the expression (+ 22 3).
The expression begins with an open parenthesis, so it is either a
procedure call or a definition.
The next thing is not the keyword define, so the whole
expression is a procedure call. The procedure will we apply
is +, the addition procedure.
We need to evaluate each argument.
The first argument begins with 2.
2 is a digit, and therefore starts a number. We read the
number 22, and we're done with that argument.
The second argument begins with 3.
3 is a digit, and therefore starts a number. We read the number
3, and we're done with that argument.
The next thing is a close parenthesis, so we know that we're
done with the argument list.
We now apply the procedure (that is, +) to the two arguments,
(that is, 22 and 3). The result is 25.
We print the result: 25
Now, let's suppose we had a somewhat more complex expression, such as
(* pi (expt (/ diameter 2) 2)). As you might note, that expression computes the area of a circle given its diameter.
The expression begins with an open parenthesis, so it is either a
procedure call or a definition.
The next thing is not the keyword define, so the whole
expression is a procedure call. The procedure we will apply
is *.
We need to evaluate each argument.
The first argument begins with a letter. It is therefore a name
of some sort. We read the whole name, which is pi.
We look up pi in the definitions table. If we are
lucky, it's been predefined. (It is predefined in MediaScript,
so we're comfortable using it.) Let's see ... it's defined
as 3.14159.
The next argument begins with an open parenthesis, so it is either a
procedure call or a definition.
The next thing after the open parenthesis is not the
keyword define, so the whole expression
is a procedure call. The procedure we will apply is
expt.
We need to evaluate the arguments.
The first argument begins with an open parenthesis, so it is
either a procedure call or a definition.
The next thing is not the keyword define, so the whole
expression is a procedure call. The procedure we will apply
is / (division).
We need to evaluate the arguments.
The first argument begins with a letter. It is therefore
a name. In this case, it is the name diameter.
We look up diameter in the table. Hmmm ...
it's not a predefined value (which is probably good, since
it is likely that programs will want to use only one diameter).
But we haven't defined it either. Boom!
It is clearly impossible to continue evaluating the expression,
so we stop.
It's frustrating to do all the preparatory work for no result,
isn't it? That's one of the reasons computers are malicious.
Well, it seems we should have defined diameter first.
Let's do so. (define diameter 10)
The expression begins with an open parenthesis. It is either a
definition or a procedure call.
The next thing to appear is the keyword define. It's
a definition. The name we will be defining is diameter.
We now need to evaluate the expression.
The expression begins with the digit 1, so it must be a number.
We read the number 10, and we're done.
We add the entry [diameter : 10] to the table.
Okay, let's go back and explore the previous expression,
(* pi (expt (/ diameter 2) 2)).
The expression begins with an open parenthesis, so it is either a
procedure call or a definition.
The next thing is not the keyword define, so the whole
expression is a procedure call. The procedure we will apply
is *.
We need to evaluate each argument.
The first argument begins with a letter. It is therefore a name
of some sort. We read the whole name, which is pi.
We look up pi in the definitions table. If we are
lucky, it's been predefined. (It is predefined in MediaScript,
so we're comfortable using it.) Let's see ... it's defined
as 3.14159.
The next argument begins with an open parenthesis, so it is either a
procedure call or a definition.
The next thing after the open parenthesis is not the
keyword define, so the whole expression
is a procedure call. The procedure we will apply is
expt.
We need to evaluate the arguments.
The first argument begins with an open parenthesis, so it is
either a procedure call or a definition.
The next thing is not the keyword define, so the whole
expression is a procedure call. The procedure we will apply
is / (division).
We need to evaluate the arguments.
The first argument begins with a letter. It is therefore
a name. In this case, it is the name diameter.
We look up diameter in the table. Since
we just put [diameter : 10] into the table,
we can be fairly confident that we'll find that entry,
and can therefore use the value 10.
The next argument begins with 2, which is a digit. We
read the complete number, with is 2.
The next thing to appear is a close parenthesis. We must
be done with the arguments to the / procedure..
We're now ready to divide 10 by 2. The result is 5.
As you may recall, we've been processing the arguments to
expt. Do any remain?
The next thing we see is a digit, 2. We read the rest of
the number, which is 2.
The next thing we see is a close parenthesis. We're done
with the arguments. (That's a good thing, since
expt only has two parameters.)
We compute 5 squared, which is 25.
Okay, we're now back to processing the parameter list to the
outermost multiplication. So far, we've seen pi,
which we found was 3.14159, and an expression, which we just
found was 25.
The next thing left is a close paren, which marks the end of
the expression.
We can finally do the multiplication: 3.14159 * 25 is 78.53975
The Evaluation Algorithm
Here's a simplified version of the algorithm that the Scheme interpreter
seems to follow.
Look at the next non-space character
If the next non-space character is an open parenthesis
Look at the next thing.
If the next thing is the keyword define
Read the next thing (a name to define)
Read the next thing (an expression)
Evaluate the expression, giving a value
Add a [name : value] entry to the dictionary
Otherwise the next thing must be a function
Read and evaluate each argument
Apply the function to the evaluated arguments
Otherwise, see if the next non-space character is a digit (or + or -)
Read all the parts of a number
The number is the result
Otherwise, see if the next non-space character is a letter
Read everything up to the next space (or close paren).
Look up the thing just read in the dictionary.
If it's not there, crash and burn
Otherwise, the result is the value found in the dictionary
A Slight Complexity: Naming Functions
Of course, things are not quite as simple as the discussion above suggests
(and it's not even clear that the discussion above suggests things are
simple). For example, there are a host of kinds of Scheme expressions that
we have not yet explored. We will consider those in future discussions
of Scheme's evaluation strategies.
However, we've left off one thing that's relevant to the Scheme you
already know. As you've seen, Scheme lets you write definitions for
functions, such as (define multiply *). What does that
do to our evaluation strategy? Primarily, it means that we have to
check if the thing we see after an open parenthesis is a defined name
and, if so, get its definition from the dictionary.
In fact, Scheme looks up the procedure, even if you're using one of the standard
procedures. Hence, if you use + or sqrt, it still
looks it up in the table, where it finds the internal representation of
the corresponding procedure. You can tell that Scheme looks up these standard
procedures by redefining them. For example, (define + -) will make your
program behave very strangely thereafter.