| Schedule | Readings | Labs | Homework | Mechanics | Contact | |
| CSC 151-01, 2007S » Reading 13 » Procedures as Contracts | ||||||
Summary: We consider the constraints we might place upon procedures and mechanisms for expressing those constraints.
Contents:
Several of the Scheme procedures that we have written or studied in
preceding labs presuppose that their arguments will meet specific
preconditions -- constraints on the types or
values of its
arguments. For instance, we saw in the first
lab on recursion that the
greatest-of-list procedure requires its
argument to be a
non-empty list of real numbers. If some careless programmer invokes
greatest-of-list and gives it, as an argument,
the empty list,
or a list in which one of the elements is not a real number, or perhaps
even some Scheme value that is not a list at all, the computation that
the
definition of greatest-in-list describes
cannot be completed.
> (greatest-of-list null)
cdr: expects argument of type <pair>; given ()
> (greatest-of-list (list 1 2 3 'a))
max: expects type <real number> as 2nd argument, given: a; other arguments were: 3
> (greatest-of-list 2)
cdr: expects argument of type <pair>; given 2
As you can see, none of these error messages are particularly helpful. Whose responsibility is it to handle these types of errors? As we will see, it is possible to share responsibility between the person who writes a procedure and the person who calls a procedure.
A procedure definition is like a contract between the author of the
definition and someone who invokes the procedure. The
postconditions of the procedure are what the
author guarantees:
When the computation directed by the procedure is finished, the
postconditions shall be met. Usually the postconditions are constraints
on the value of the result returned by the procedure. For instance, the
postcondition of the square procedure,
(define square
(lambda (val)
(* val val)))
is that the result is the square of the argument val.
The preconditions are the guarantees that the invoker of a procedure
makes
to the author, the constraints that the arguments shall meet. For
instance, it is a precondition of the square
procedure that
val is a number.
If the invoker of a procedure violates its preconditions, then the
contract is broken and the author's guarantee of the postconditions is
void. (If
val is, say, a list of symbols, then the
author can't very
well guarantee to return its square.) To make it less likely that an
invoker violates a precondition by mistake, it is usual to document
preconditions carefully and to include occasional checks in one's
programs,
ensuring that the preconditions are met before starting a complicated
computation.
Many of DrScheme's primitive procedures have such preconditions, which they enforce by aborting the computation and displaying a diagnostic message when the preconditions are not met:
> (/ 1 0)
/: division by zero
> (log 0)
log: undefined for 0
> (length 116)
length: expects argument of type <proper list> given 116
To enable us to enforce preconditions in the same way, DrScheme
provides a
procedure named error, which takes a string
as its argument.
Calling the error procedure aborts the entire
computation of
which the call is a part and causes the string to be displayed as a
diagnostic message.
For instance, we could enforce greatest-of-list's
precondition that its parameter be a non-empty list of reals as by
rewriting its definition thus:
(define greatest-of-list
(lambda (ls)
(if (or (not (list? ls))
(null? ls)
(not (all-real? ls)))
(error "greatest-of-list: requires a non-empty list of reals")
(if (null? (cdr ls))
(car ls)
(max (car ls) (greatest-of-list (cdr ls)))))))
where all-real? is a predicate that we have
to write that
takes any list as its argument and determines whether or not all of the
elements of that list are real numbers:
Now the greatest-of-list procedure enforces
its precondition:
> (greatest-of-list 139)
greatest-of-list: requires a non-empty list of reals
> (greatest-of-list null)
greatest-of-list: requires a non-empty list of reals
> (greatest-of-list (list 71/3 -17 23 'oops 16/15))
greatest-of-list: requires a non-empty list of reals
Including precondition testing in your procedures often makes them markedly easier to analyze and check, so I recommend the practice, especially during program development. There is a trade-off, however: It takes time to test the preconditions, and that time will be consumed on every invocation of the procedure. Since time is often a scarce resource, it makes sense to save it by skipping the test when you can prove that the precondition will be met. This often happens when you, as programmer, control the context in which the procedure is called as well as the body of the procedure itself.
For example, in the preceding definition of greatest-of-list,
although it is useful to test the precondition when the procedure is
invoked from outside
by a potentially irresponsible
caller, it is a
waste of time to repeat the test of the precondition for any of the
recursive calls to the procedure. At the point of the recursive call,
you
already know that ls is a list of strings
(because you tested
that precondition on the way in) and that its cdr is not empty (because
the
body of the procedure explicitly tests for that condition and does
something other than a recursive call if it is met), so the cdr must
also
be a non-empty list of strings. So it's unnecessary to confirm this
again
at the beginning of the recursive call.
One solution to this problem is to replace the definition of
greatest-of-list with two separate procedures,
a husk
and
a kernel
. The husk interacts with the outside
world, performs the
precondition test, and launches the recursion. The kernel is supposed
to be
invoked only when the precondition can be proven true; its job is to
perform the main work of the original procedure, as efficiently as
possible:
(define greatest-of-list
(lambda (ls)
; Make sure that ls is a non-empty list of real numbers.
(if (or (not (list? ls))
(null? ls)
(not (all-real? ls)))
(error "greatest-of-list: requires a non-empty list of reals")
; Find the greatest number in the list.
(greatest-of-list-kernel ls))))
(define greatest-of-list-kernel
(lambda (ls)
(if (null? (cdr ls))
(car ls)
(max (car ls) (greatest-of-list-kernel (cdr ls))))))
The kernel has the same preconditions as the husk procedure, but does not need to enforce them, because we invoke it only in situations where we already know that the preconditions are satisfied.
The one weakness in this idea is that some potentially irresponsible caller might still call the kernel procedure directly, bypassing the husk procedure that he's supposed to invoke. Later this week, we'll see that there are a few ways to put the kernel back inside the husk without losing the efficiency gained by dividing the labor in this way.
When programmers write code, they also write documentation that explains the code. The computer certainly doesn't need any such documentation (and even ignores it), so why should one take the time to write documentation? There are a host of reasons.
sqrt,
list-ref, and a host of other procedures
without understanding how they are defined.)
As all three examples suggest, when we write code, we write not just for the computer, but also for a human reader. Even the best of code needs to be checked again on occasion, and lots of code gets modified for new purposes. Good documentation helps those who must support or modify the code understand it.
As the third example suggests, documentation also helps programmers who
use your code (we sometime call these client programmers
).
In many
ways, the documentation you write for such programmers is the most
important
documentation you can write.
Different organizations have different styles of documentation. After a large number of years documenting procedures and teaching students to document procedures, I've developed a style that I'm happy with and that I find helps students think carefully about their work. (I've also received a few notes from folks in industry who see this documentation and praise me for teaching it to students.)
To keep it easy to remember what belongs in the documentation for a
procedure,
I say that students should focus on the six P's
:
Procedure, Parameters,
Purpose, Produces, Preconditions, and Postconditions.
The procedure section simply names the procedure. The name of the procedure should be obvious from the code. But, by including the name in the documentation, we make it possible for the client programmer to learn about the procedure by reading only the documentation.
The parameters section names the parameters to the procedure and gives them types. For example, if a procedure operates only on numbers, the parameters section should indicate so.
The purpose section briefly describes what
the procedure is supposed to do. The sentences need not be as precise
as what you'd give a computer, but they should be clear to the average
programmer. (As you've learned in your writing, write to your
audience.)
The produces section names and types the result of the procedure. Often, the result is not named in the code of the procedure. So why do we both to include such a section? Because naming the result lets us discuss it (either in the purpose above or in the preconditions and postconditions below).
These first four P's give an informal definition of what the procedure does. The last two P's give a much more formal definition of the purpose of the procedure. You've seen at the beginning of this reading that the preconditions are the conditions that must be met in order for the procedure to work and that preconditions and postconditions are a form of contract. Since they are a contract, we do our best to specify them formally.
The preconditions section provides additional
details
on the valid inputs the procedures accepts and the state of the
programming system that is necessary for the procedure to work
correctly. For example, if a procedure depends on a value being named
elsewhere, the dependency should be named in the preconditions section.
The preconditions section can be used to include restrictions that
would
be too cumbersome to put in the parameters section. For example, in
many
programming languages, there is a cap on the size of integers. In such
languages, it would therefore be necessary for a square
procedure to put a cap on the size of the input value to have an
absolute
value less than or equal to the square root of the largest integer.
When documenting your procedures, you may wish to note whether a precondition is verified (in which case you should make sure to print an error message) or unverified (in which case you may still crash and burn, but the error will come from one of the procedures you call).
The postconditions section provides a more formal description of the results of the procedure. For example, we might say informally that a procedure reverses a list. In the postconditions section, we provide mathematical formulae that indicate what it means to have reversed the list.
Typically, some or all of the preconditions and postconditions are expressed with formulae or code.
Let us first consider a simple procedure that squares its input value and that restricts that value to an integer. Here is one possible set of documentation for an environment in which there is a limit to the size of integers.
;;; Procedure:
;;; square
;;; Parameters:
;;; val, an integer
;;; Purpose:
;;; Computes the square of val.
;;; Produces:
;;;; val-squared, an integer
;;; Preconditions:
;;; MAXINT is defined and is the size of the largest possible integer.
;;; val <= (sqrt MAXINT)
;;; Postconditions:
;;; val-squared = val*val
You will note that the preconditions specified are those described in
the
narrative section: We must ensure that val is
not too large
and we must ensure that the value we use to cap val
is defined.
Let us then turn to the famous reverse
procedure, a procedure
that reverses the order of elements in the list.
;;; Procedure:
;;; reverse
;;; Parameters:
;;; lst, a list
;;; Purpose:
;;; Reverses lst.
;;; Produces:
;;; reversed, a new list.
;;; Preconditions:
;;; None.
;;; Postconditions:
;;; Let n be the length of lst
;;; For all i such that 0 <= i < n
;;; (list-ref reversed i) equals (list-ref lst (- n i 1))
There are many things to note about this definition. First, we don't
list any preconditions, even though we require that lst
is a list. We can avoid other preconditions because the types specified
in the parameters section are included implicitly. Next, the
postconditions
are stated quite formally, with a combination of mathematical and
Scheme notation. Using a formal notation helps avoid ambiguities.
Finally, we don't specify every postcondition. For example,
reverse has no effect on its parameters.
Typically, unless
a postcondition specifies something happens, we assume that the thing
does not happen (in particular, we assume that procedures do not modify
their parameters).
As noted above, the preconditions and postconditions form a contract with the client programmer: If the client programmer meets the type specified in the parameters section and the preconditions specified in the preconditions section, then the procedure must meet the postconditions specified in the postconditions section.
As with all contracts, there is therefore a bit of adversarial balance between the preconditions and postconditions. The implementer's goal is to do as little as possible to meet the postconditions, which means that the client's goal is to make the postconditions specify the goal in such a way that the implementer cannot make compromises. Similarly, one of the client's goals may be to break the procedure (so that he may sue or reduce payment to the implementer), so the implementer needs to write the preconditions and parameter descriptions in such a way that she can ensure that any parameters that meet those descriptions can be used to compute a result.
Although we typically suggest using the basic six P's (procedure, parameters, purpose, produces, preconditions, and postconditions) to document your procedures, there are a few other optional sections that you may wish to include. For the sake of alliteration, we also begin those sections with the letter P.
In a Practica section, you can give some sample interactions with your procedure. We find that many programmers best understand the purpose of programs through examples, and the Practica section gives you the opportunity to give clarifying examples.
In a Problems section, you might note special
cases or
issues that are not sufficiently covered. Typically, all the
problems are handled by eliminating invalid inputs in the preconditions
section, but until you have a carefully written preconditions section,
the problems section provides additional help (e.g., the
behavior
of this procedure is undefined if the parameter is 0
).
In a Philosophy section, you can discuss the broader role of this procedure. Often, procedures are written as part of a larger collection. This section gives you an opportunity to specify these relationships.
In a Process section, you can discuss a bit about the algorithm you use to go from parameters to product. In general, the client should not need to know your algorithm, but sometimes it is helpful to reveal a bit about the algorithm.
You may find other useful P's as your program. Feel free to introduce them into the documentation. (Feel free to use terms that do not begin with P, too.)
Janet Davis (davisjan@cs.grinnell.edu)
Created February 15, 2007 based on