In testing, the main goal should be to try to find all possible errors; that is, one should try to find test cases which might break the code. Note this is a different perspective than seeing if one can find many test cases that work correctly: if code works correctly in one case, then it may well work correctly in many similar cases, so testing those similar cases reveals little that is new. Overall, one should identify and test as many types of circumstances as possible. Section 2.4 of the text provides some ideas about how such test cases might be chosen.
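One way to organize such testing is to write down one case for each type of circumstance and compare the actual result with the expected one. The sketch below is only an illustration: check is a hypothetical helper (it is not part of Chez Scheme or of this lab), and the cases exercise the built-in quotient and remainder procedures with different combinations of signs.

```scheme
;; CHECK is a hypothetical helper: it returns the symbol pass when the
;; actual result equals the expected one, and a description of the
;; failure otherwise.
(define check
  (lambda (label actual expected)
    (if (equal? actual expected)
        'pass
        (list 'fail label actual expected))))

;; One case per type of circumstance, rather than many similar cases.
(check "both positive"     (quotient 38 5)     7)
(check "negative dividend" (quotient -110 12)  -9)
(check "negative divisor"  (quotient 38 -5)    -7)
(check "zero dividend"     (quotient 0 5)      0)
(check "remainder sign"    (remainder -110 12) -2)
```

A case that is expected to signal an error, such as division by zero, cannot be handled by this helper and has to be tried separately.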
While the initial running of a program has been known to produce helpful and correct results, your past programming experience probably suggests that some errors usually arise somewhere in the problem-solving process. Specifications may be incomplete or inaccurate, algorithms may contain flaws, or the coding process may be incorrect. Edsger Dijkstra, a very distinguished computer scientist, has observed¹ that in most disciplines such difficulties are called errors or mistakes, but that in computing this terminology is usually softened, and flaws are called bugs. (It seems that people are often more willing to tolerate errors in computer programs than in other products.)²
Novice programmers sometimes approach the task of finding and correcting an error by trial and error, making successive small changes in the source code (``tweaking'' it), and reloading and re-testing it after each change, without giving much thought to the probable cause of the error or to how making the change will affect its operation. This approach to debugging is ineffective, for two reasons:
Tweaking is time-consuming. Novice programmers tend to have a naive confidence that the next small change in the source code, whatever it is, will fix the problem. This is seldom the case. If you detect an error in a procedure, and the first tweak doesn't fix it, the next twelve tweaks probably won't either -- so don't bother with them. Push yourself away from the keyboard and study the context. Don't make even one more change in the source code until you're ready to test a well-thought-out hypothesis about the cause of the error. (This is also a good time to make a separate copy of the procedure, in XEmacs, so that you can backtrack to the current version if subsequent experimentation requires extensive temporary rewriting.)
Tweaking usually fixes only a specific, local problem. Very often an error is a symptom of a general misunderstanding on the part of the programmer, one that affects the operation of the procedure in cases other than the one being tested. Unless you address this general problem, tweaking a procedure in such a way that it passes the particular test that it formerly failed is likely to make your program worse instead of better.
In the course of the semester, we'll explore several alternatives to this approach to debugging. This laboratory exercise reviews a ``software tool'' that the designer of Chez Scheme provided as an aid to debugging: the Chez Scheme interactive inspector. This tool collects information about the context in which an error has occurred, so the programmer can understand what events led to the error. In other words, the interactive inspector does not fix errors, but rather tries to put you in a better position to fix them.
To investigate the Chez Scheme debugger, we consider an example.
Start up Chez Scheme and give it the following procedure definition:
(define quot-and-rem
  (lambda (dividend divisor)
    (list (quotient dividend divisor)
          (remainder dividend divisor))))
This procedure takes two arguments, both integers, and divides the first one by the second; it returns a list of two integers, the quotient and the remainder resulting from the division. Check that the procedure gives the following results:
> (quot-and-rem 38 5)
(7 3)
> (quot-and-rem -110 12)
(-9 -2)
> (quot-and-rem 53 0)

Error in quotient: undefined for 0.
Type (debug) to enter the debugger.
No problem with the first two examples, but in the third one we indirectly asked for a division by zero and got an error message. If you type (debug) after the third procedure call, the prompt debug> appears; it indicates that your next command will be read and processed by the Chez Scheme debugger rather than by the usual interactive interface. Normally, you will respond to the debug> prompt by typing one of three responses.
debug> i
#<system continuation in error>                                   :
Typing i starts the interactive inspector. It presents you with the not-very-enlightening statement that when the error was reported, Chez Scheme was executing a procedure named error and was about to continue by restarting the ``system'' -- generating a new prompt and waiting for more input. The colon near the right side of the window is the inspector's prompt; the inspector is waiting for instructions on what you'd like to see at this point.
The inspector arranges the information about the context of the error into frames, one frame for each procedure that had been invoked but had not yet completed its work when the error occurred. Various commands provide you with information about the frames:
| Command | Expanded Name | Meaning |
|---|---|---|
| depth | Depth | Tell how many frames there are |
| sf | Show Frame | Display list of all frames |
| s | Show | Display information on current frame |
| d | down | Move down one frame |
| u | up | Move up one frame |
| t | top | Move to top frame |
| b | bottom | Move to bottom frame |
| ? | help | Obtain list of commonly used commands |
| ?? | complete help | Obtain list of all commands |
| q | quit | Leave inspector |
| e | exit | Leave debugger |
For example, sf produces the following list:
0: #<system continuation in error>
1: #<system continuation in quotient>
2: #<continuation in quot-and-rem>
3: #<top level continuation>
At the ``top level,'' we typed in a call to quot-and-rem; that
procedure was invoked but never returned, because the error occurred before
the value was computed. The quot-and-rem procedure invoked
quotient to get the first item for the list; the
quotient procedure also never returned, because it couldn't
complete a division by zero. Instead, it invoked the error
procedure, which stored all this information away, printed out the
appropriate error message, and turned things back over to the Chez Scheme
interactive interface; technically, error hasn't returned yet
either.
Since the trouble seems to involve the quotient procedure, we
can move one frame down from ``system continuation in
error'' and examine that frame by using the d and s
commands. This gives the following information:
#<system continuation in quotient>                                : s
  continuation:          #<continuation in quot-and-rem>
  free variables:
  0: 0
  1: 53
  2: #<system procedure quotient>
The ``continuation'' entry identifies the frame to which the program
intended to return after completing the call to the procedure described in
the current frame. (It describes how the program would have
continued if the error had not occurred.) The list of ``free
variables'' gives you the values of the parameters for the procedure
invocation, starting at the right end: parameter #0, which is
divisor, had the value 0, and parameter #1, which is
dividend, had the value 53. The value listed here as #2 is
the procedure that was invoked. Remember that in Scheme a procedure is
also a datum stored in a variable; in this case it is a ``system
procedure'' (that is, a predefined procedure of Chez Scheme) stored in the
variable quotient.
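The remark that a procedure is a datum stored in a variable can be checked directly at the Scheme prompt. In this minimal sketch, divide is a hypothetical name introduced only for illustration:

```scheme
;; The variable QUOTIENT holds a procedure value, so that value can be
;; stored in another variable as well (DIVIDE is a hypothetical name).
(define divide quotient)

(divide 53 7)           ; the same computation as (quotient 53 7)
(procedure? quotient)   ; tests whether the stored value is a procedure
```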
Use the d and s commands again to examine the frame for the
quot-and-rem procedure. Try to interpret the resulting
information.
Consider a procedure second, which
returns the second element of a list:
(define second
  (lambda (ls)
    (cadr ls)))
Now consider the following interactions:
> (second '(a b c d))
b
> (second '(a))
Error in cadr: incorrect list structure (a).
Type (debug) to enter the debugger.
As you can see, this version of second relies on the
underlying implementation of Chez Scheme to detect and report such errors.
Sometimes this is an adequate way of dealing with them, but in other cases
one might prefer to write the procedure in such a way that it checks and
enforces its preconditions before performing any operations on its
arguments.
In Chez Scheme, this can be done by adding, at the beginning of the body of
the procedure, an if-expression that tests the precondition
and invokes a procedure named error if it is not met:
(define second
  (lambda (ls)
    (if (or (not (list? ls))
            (null? ls)
            (null? (cdr ls)))
        (error 'second "second's argument must be a list of >= 2 elements")
        (cadr ls))))
The revised second procedure tests whether its argument is a
list of at least two elements. The
definition of the second procedure is the same as the
one provided in the lab on recursion, except that the precondition is now
being tested: the error procedure is invoked if the
incoming argument is not a list, is an empty list, or has only one element.
Test the revised second procedure in the following cases:
(second 5)
(second 'a)
(second '())
(second '(a))
(second '(a b))
(second '(a b c))
In each case, describe the resulting output.
Write a procedure last, which returns the final element of a
non-empty list. Include a precondition test that ensures that the
argument is a non-empty list.
Including precondition testing in your procedures often makes them markedly easier to analyze and debug, so I recommend the practice, especially during program development. There is a trade-off, however: It takes time to test the preconditions, and that time will be consumed on every invocation of the procedure. Since time is often a scarce resource, it makes sense to save it by skipping the test when you can prove that the precondition will be met. This often happens when you, as programmer, control the context in which the procedure is called as well as the body of the procedure itself.
For example, in writing your last procedure, your code should have
tested the precondition when the procedure is invoked ``from outside'', as
this guards against careless or irresponsible calls of the
last procedure. However, it is a waste of time to repeat the
test of the precondition for any of the recursive calls to the
procedure. At the point of the recursive call, you already know that
ls is a non-empty list of at least two elements (because you
probably set up your code with that precondition in mind). Thus, it is
unnecessary to confirm this again at the beginning of the recursive call.
One solution to this problem is to replace the definition of
last with two separate procedures, a ``husk'' and a
``kernel.'' The husk interacts with the outside world, performs the
precondition test, and launches the recursion. The kernel is supposed to
be invoked only when the precondition can be proven true; its job is to
perform the main work of the original procedure, as efficiently as
possible:
(define last
  (lambda (ls)
    ;; Make sure that LS is a non-empty list.
    (if (-- check precondition here --)
        (error 'last "the argument must be a non-empty list"))
    ;; Find the last element on the list.
    (last-kernel ls)))

(define last-kernel
  (lambda (ls)
    (-- the main body of the recursive program goes here --)))
In later labs, we'll see that there are a couple of ways to put the kernel back inside the husk without losing the efficiency gained by dividing the labor in this way.
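To make the division of labor concrete, here is a minimal sketch of the husk-and-kernel pattern applied to a different problem: finding the largest number in a non-empty list. The names largest-in-list and largest-kernel are hypothetical, and the sketch assumes (without checking) that the elements are real numbers.

```scheme
;; Husk: tests the precondition once, then launches the recursion.
(define largest-in-list
  (lambda (ls)
    (if (or (not (list? ls))
            (null? ls))
        (error 'largest-in-list "the argument must be a non-empty list")
        (largest-kernel (car ls) (cdr ls)))))

;; Kernel: called only when the precondition is known to hold, so it
;; performs no tests beyond the one that ends the recursion.
;; SO-FAR is the largest element seen; REST holds the unexamined ones.
(define largest-kernel
  (lambda (so-far rest)
    (if (null? rest)
        so-far
        (largest-kernel (max so-far (car rest)) (cdr rest)))))
```

The list? test traverses the whole list, so performing it only in the husk means it is done once per outside call rather than once per recursive call.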
Write a husk-and-kernel version of your last procedure.
The error procedure in Chez Scheme takes two arguments. The
first should always be the symbol that names the procedure inside which the
error has occurred -- in our example, the symbol
second. The second should be a diagnostic -- a
string that states specifically what precondition failed.
When the error procedure is invoked, Chez Scheme interrupts
the computation in progress, stores it where the debugger and inspector
can get hold of it, and returns you to the top level with an error message.
The symbol and the diagnostic string that were given as arguments to
error appear in this message.
Chez Scheme provides another procedure that can be used like
error and takes the same two arguments: The
warning procedure prints out a warning message, but does not
interrupt the computation in progress; instead, Chez Scheme rushes on and
attempts to complete the job. (Often it eventually encounters an actual
error and gives up, but one can call warning regardless of
whether or not it presages an error.)
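As an illustration of the difference, the sketch below calls warning and then carries on with the computation. It assumes the Chez Scheme warning procedure behaves as described above; the procedure celsius->kelvin is a hypothetical example, not part of the lab.

```scheme
;; Hypothetical example: convert a Celsius temperature to Kelvin.
;; A physically impossible input produces a warning, but the
;; conversion is still carried out and its result returned.
(define celsius->kelvin
  (lambda (c)
    (if (< c -273.15)
        (warning 'celsius->kelvin "temperature is below absolute zero"))
    (+ c 273.15)))
```

Replacing warning with error in this definition would instead stop the computation at that point and report the problem at the top level.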
Write a procedure sum-of-list that takes any list of
numbers and returns the sum of the elements of the list. Have the
procedure print a warning message if it is given an empty list. (The
procedure should return 0 after issuing this warning message.)
Define a predicate author? that takes one argument and
determines whether that argument is a list containing exactly three
elements, the first one a string and the second and third ones integers.
(The predicate should return #t if all of these
conditions are met, #f if any one of them is not
satisfied.)
Notes
This document is available on the World Wide Web as
http://www.math.grin.edu/~walker/courses/153/lab-testing-debugging.html