| CSC 153 | Grinnell College | Spring, 2009 |
| Computer Science Fundamentals |
This laboratory exercise introduces the concept of expert systems in general and a specific expert system in particular. Further, since such programs often utilize file input, this application provides an example of the need to convert data generated by one program into a different form for use in another program.
Every college and university must decide how to place incoming students in appropriate courses. In many cases, placement recommendations may be based on standardized tests (SATs, ACTs, Achievement Tests, Advanced Placement (AP) scores, International Baccalaureate (IB) scores) and high school transcripts. Historically at Grinnell, faculty reviewed such data manually and made recommendations.
Although faculty continue to be actively involved, the Departments of Computer Science and Mathematics/Statistics now use an expert system to automate much of this process. The first version of the current program was developed in Spring 1993 as a student-faculty project; students Vikram Subramaniam and Ivan Sykes worked with faculty member Henry Walker. The program has been refined and expanded several times over the years to reflect new or revised courses. Also, follow-up studies of student success in their first computer science and mathematics courses have motivated adjustments to rules and have provided reasonable validation of the system.
This lab explores this placement program and identifies some principles that extend more generally to the construction of expert systems.
The main placement program can be used in several ways:
The Web interface is simplest.
Try running this program through a Web browser, using the address
http://www.cs.grinnell.edu/~walker/placement/placement-form.html
Try running the program through the browser again, this time entering some impossible values (e.g., a negative number of semesters, SAT scores over 800, etc.). Describe how the program responds to these values.
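Expert systems, including the placement program, have two general parts: a rule base, which holds the rules, and an inference engine, which applies those rules to the data for a particular case.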
For the placement program, both the rule base and inference engine are written in LISP, a programming language that is quite similar to Scheme. We first examine the rule base for the placement program.
The overall approach is to start with information from standardized scores and the student's high school transcript. Rules then make inferences about the standardized scores, possible interim or temporary placements, and a final placement. The following diagram shows the basic process for math placements.
Open the file /home/walker/placement/newstudents.lsp within an editor, so that you can review the rules easily.
Also, load the rules and program into LISP within a terminal window:
lisp
(load "/home/walker/placement/newstudents.lsp")
Although the details of this rule base are quite sophisticated, the general organization is reasonably straightforward. The first part of newstudents.lsp defines the variables for placement, and the rest of that file contains numerous rules. The file newstudents.lsp also loads the inference engine tmycin.lsp, a program based on TMYCIN, written by Gordon Novak at the Artificial Intelligence Laboratory of the University of Texas at Austin.
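To make the split between a rule base and an inference engine concrete, here is a minimal sketch in Python rather than LISP. Everything in it is an assumption made for illustration: the rule names, the attribute name "sat-math", and the score threshold are invented and are not taken from newstudents.lsp. The loop also works forward from the data, which is a simplification and should not be read as a description of how TMYCIN itself works.

```python
# Rule base: plain data that can be edited without touching the engine.
# Each rule is (name, premise, attribute concluded, value function).
# All names and thresholds here are hypothetical, not the lab's real rules.
RULES = [
    ("toy-rule-1", lambda f: f.get("sat-math", 0) >= 600,
     "tplace", lambda f: 133),
    ("toy-rule-2", lambda f: "sat-math" in f and f["sat-math"] < 600,
     "tplace", lambda f: 131),
    ("toy-rule-3", lambda f: "tplace" in f,
     "place", lambda f: f["tplace"]),
]


def infer(facts, rules):
    """Inference engine: fire rules until no rule adds a new fact."""
    facts = dict(facts)   # leave the caller's data untouched
    fired = []            # names of the rules used, in firing order
    changed = True
    while changed:
        changed = False
        for name, premise, attribute, value in rules:
            if attribute not in facts and premise(facts):
                facts[attribute] = value(facts)
                fired.append(name)
                changed = True
    return facts, fired


facts, fired = infer({"sat-math": 640}, RULES)
print(facts["place"], fired)   # prints: 133 ['toy-rule-1', 'toy-rule-3']
```

Because the engine records which rules fired, it could answer a question like "why this placement?" by reading the list back, which is the same idea the lab explores with real rules later on.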
As you first review the rule base, a few comments may provide some orientation:
We now examine the rules somewhat more closely.
Review rule125, the first rule that gives a preliminary placement of a student within Math 133. To display the rule, type
(showrule 'rule125)
(englrule 'rule125)
in the terminal window running LISP and newstudents.lsp.
Write a sentence or two explaining this rule.
Write a few sentences that describe rules 156 and 190. Why do you think rule 190 is needed for final placement? Why not just use the temporary or preliminary placements (TPLACE)?
Run the placement process with some impossible values and describe how the program responds.
The TMYCIN inference engine applies the rules for the data from a student, and also allows us to obtain some justifications for the placement.
If it is not already running, start LISP and load /home/walker/placement/newstudents.lsp within a terminal window, following the commands in step 2.
Now run the placement option, by typing
(placement)
When the menu appears, use the "I" option to run the program interactively. Then, enter data for yourself or for a hypothetical student as requested.
When you are done, type "Q" to quit the menu. You now should be back at the prompt within LISP.
Within the terminal window, type
(why)
This will give you the rule for the final placement. Now trace the logic backward. For example, if (why) gives a final placement of 215, expand the why query:
(why tplace 215)
TMYCIN also allows us to ask why another conclusion was not used. Continuing the 215 example:
(whynot 133)
(whynot tplace 133)
Based on this inquiry, write a paragraph on what rules were used in obtaining the placement from your example.
The TMYCIN system assigns a student number to each record that is entered. In your interaction using why, the system also prints this identifier (e.g., student1540). Use the showprops command to display all the information derived for this student:
(showprops 'student1540)
With each property, the system maintains a confidence factor, on a scale of -1 (confident the property is false) to 0 (no confidence whatsoever) to 1 (confident the property is true). Review the properties for this placement to determine how confident the system was in reaching its conclusion.
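To get a feel for the -1 to 1 scale, the sketch below shows one classic way that MYCIN-style systems merge two confidence factors for the same property. This lab does not say whether TMYCIN uses exactly this formula, so treat it as an assumption; the function name combine_cf is hypothetical.

```python
def combine_cf(a, b):
    """Combine two confidence factors in [-1, 1] (classic MYCIN-style rule).

    Assumption: this is the textbook formula, not necessarily TMYCIN's.
    """
    for cf in (a, b):
        if not -1.0 <= cf <= 1.0:
            raise ValueError(f"confidence factor out of range: {cf}")
    if a >= 0 and b >= 0:
        # Two supporting pieces of evidence reinforce each other.
        return a + b * (1 - a)
    if a <= 0 and b <= 0:
        # Two opposing pieces of evidence also reinforce each other.
        return a + b * (1 + a)
    # Mixed evidence: the weaker factor partly cancels the stronger one.
    weaker = min(abs(a), abs(b))
    if weaker == 1:
        raise ValueError("certain evidence for and against cannot be combined")
    return (a + b) / (1 - weaker)


print(combine_cf(0.6, 0.5))    # roughly 0.8: more confident than either alone
print(combine_cf(0.8, -0.4))   # roughly 0.67: weakened by the contrary evidence
```

Under this formula, a factor of 0 leaves the other factor unchanged, which matches the reading of 0 as "no confidence whatsoever".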
Run (placement) to find and analyze placements for two additional student records. In each case, write a paragraph describing the placement and outlining how that placement was obtained.
To exit LISP, type
(quit)
at the LISP prompt.
Earlier in this lab, you entered some unlikely or incorrect values and noted how the program responded. This part of the lab considers the question of how programs might handle such erroneous input.
When programs are to be run by general users, the programs usually should check that user input is reasonable and consistent. Typically, input errors fall into three categories:
In some situations, we can determine that a particular input value could not possibly be correct. For example, SAT scores must be between 200 and 800. In such cases, programs could check whether an input value is possible. If not, the program could either ignore the data or ask the user to re-enter the value.
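A check for impossible values can be sketched as follows, in Python rather than LISP. The function name check_sat is hypothetical, and the sketch is not taken from the placement program; it only uses the 200 to 800 range stated above and treats the word "Unknown" as a legitimate entry.

```python
def check_sat(text):
    """Return an int SAT score, None for 'Unknown', or raise ValueError."""
    cleaned = text.strip()
    if cleaned.lower() == "unknown":
        return None
    try:
        score = int(cleaned)
    except ValueError:
        # Covers misspellings such as "Unknwon" as well as stray text.
        raise ValueError(f"not a number or 'Unknown': {text!r}") from None
    if not 200 <= score <= 800:
        raise ValueError(f"SAT score must be between 200 and 800: {score}")
    return score


print(check_sat("650"))       # prints: 650
print(check_sat("Unknown"))   # prints: None
```

A calling program could catch the ValueError and either discard the value or prompt the user again, which are the two responses described above.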
Run the interactive version of the placement expert system, and enter some scores that are impossible. How does the program behave when just one or just a few data values are impossible? In this test, include both integer values that are out of range and a misspelling of "Unknown".
Repeat the previous step using the Web-based interface to the placement expert system.
In other applications, we may want a user to double-check an input value before the program continues. We might expect values of a certain type or range, but we cannot rule out the possibility of other values. For example, in a grocery store, we expect most individual items to cost between $0.00 and $100.00 (exclusive). Items over $100.00 are very unlikely, although they are possible. (E.g., one side of beef from the meat department could exceed this amount.)
In such cases, a program might review an input value to see if it is unlikely to be correct, and the program may ask the user to verify the correctness of the datum. For example, if a grocery item costs $150.00, the program might question whether the cashier has entered an extra 0, and the program might give the cashier the opportunity to correct this value. Since this value is possible, however, the program should allow the value to be processed.
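The difference between an impossible value and an unlikely one can be sketched in Python as follows. The names accept_price, confirm, and likely_max are hypothetical; the $100.00 threshold comes from the grocery example above. The confirmation step is passed in as a function so that the sketch runs without a keyboard; an interactive program could pass a function that asks the cashier a yes/no question.

```python
def accept_price(price, confirm, likely_max=100.00):
    """Return the price if accepted, or None if it should be re-entered.

    confirm is a hypothetical callback: it receives the suspicious price
    and returns True if the user says the value is really correct.
    """
    if price < 0:
        # Impossible value: reject outright.
        raise ValueError("a price cannot be negative")
    if price >= likely_max and not confirm(price):
        # Unlikely value that the user did not confirm: ask again.
        return None
    # Ordinary value, or an unlikely value the user confirmed.
    return price


print(accept_price(3.99, confirm=lambda p: False))     # prints: 3.99
print(accept_price(150.00, confirm=lambda p: True))    # prints: 150.0
print(accept_price(150.00, confirm=lambda p: False))   # prints: None
```

Note that the confirmed $150.00 item is processed normally, since an unlikely value is still a possible one.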
What happens when such values are entered into the Web-based placement interface?
The job of verifying input is particularly difficult when the values seem plausible. For example, if an SAT score of 650 is entered as 560, a program will have difficulty identifying the error.
In programming, one approach to such data entry errors is to ignore them, as finding them may be too costly or time consuming. A second approach involves supplying the user with a printout of all data entered, so the user can check the values manually. (The paper receipt at the grocery store allows this type of checking.)
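The second approach, echoing back everything that was entered, can be sketched in a few lines of Python. The function name summarize and the field labels are hypothetical and are not taken from the placement program.

```python
def summarize(entries):
    """Return a receipt-style listing of every value the user entered."""
    width = max((len(label) for label in entries), default=0)
    lines = [f"{label.ljust(width)} : {value}"
             for label, value in entries.items()]
    return "\n".join(lines)


# Hypothetical record in which 650 was mistyped as 560.
record = {"SAT math": 560, "Semesters of calculus": 2}
print(summarize(record))
```

The listing cannot detect that 560 should have been 650; it only gives the user a chance to notice the mistake.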
How often have you reviewed the paper receipt you obtained from a grocery store? Discuss your answer briefly.
The Web-based program displays both the potential placement and the data entered by the user. How effective does this display seem in encouraging a user to check the data entered?
This document is available on the World Wide Web as
http://www.cs.grinnell.edu/~walker/courses/153.sp09/labs/lab-placement.shtml
| created 8 January 1998 by John David Stone; last revised 9 April 2008 by Henry M. Walker |
| For more information, please contact Henry M. Walker at walker@cs.grinnell.edu. |