Expert Systems
Goals: This laboratory exercise introduces the concept of expert
systems in general and a specific expert system in particular. Further,
since such programs often utilize file input, this application provides an
example of the need to convert data generated by one program into a
different form for use in another program.
Background:
Every college and university must decide how to place
incoming students in appropriate courses. In many cases, placement
recommendations may be based on standardized tests (SATs, ACTs, Achievement
Tests, Advanced Placement (AP) scores, International Baccalaureate (IB)
scores) and high school transcripts. Historically at Grinnell, faculty
reviewed such data manually and made recommendations.
While faculty continue to be actively involved, the Department of
Mathematics and Computer Science now uses an expert system to automate much
of this process. The first version of the current program was developed in
Spring 1993 as a student-faculty project; students Vikram Subramaniam and
Ivan Sykes worked with faculty member Henry Walker. The program has been
refined and expanded several times over the years to reflect new or revised
courses.
Overview of the Placement Program:
The main program can be used in several ways:
-
users can run the program interactively at a terminal window with the user
entering needed data,
-
the program can process information received electronically from the
Registrar, producing letters for students, or
-
users can use a World-Wide-Web interface to enter data and view placement
conclusions.
The Web interface is simplest.
-
Try running this program through Netscape, using the address
http://www.math.grin.edu/~walker/placement/placement-form.html
While the placement program is written in LISP, the code and run-time
environment are quite similar to Scheme. To run the program in the
relatively familiar dtterm environment, follow these steps:
-
Run Allegro Common Lisp in a dtterm by typing the command
acl
-
At the new prompt, type
(load "~walker/placement/newstudents.lsp")
(placement)
-
When the menu appears, use the "I" option to run the program interactively.
Then, enter data for yourself or for a hypothetical student as requested.
-
When you are done, type "Q" to quit the menu. Then at the "USER" prompt,
type
(exit)
This application illustrates the structure of many expert systems. This
system has two basic parts:
-
A rule base, found in file "~walker/placement/newstudents.lsp", which
contains about 90 rules concerning the placement of students.
-
An inference engine, found in file "~walker/placement/tmycin.lsp", which
applies the rules to specific cases.
Each part is written in LISP, the most common computer language for
artificial intelligence. The inference engine is based on the program
TMYCIN, written by Gordon Novak at the Artificial Intelligence Laboratory
of the University of Texas at Austin.
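The split between a rule base and an inference engine can be sketched in a few lines. The example below uses Python rather than LISP purely for illustration. The two rules, the attribute thresholds, and the function name infer are hypothetical; none of them is taken from newstudents.lsp or tmycin.lsp.

```python
# Rule base: plain data. Each rule pairs a premise (a test on the known
# facts) with a conclusion. Both rules are invented for illustration.
RULES = [
    {"if": lambda f: f.get("SemOfMath", 0) >= 8 and f.get("Grades", 0.0) >= 3.0,
     "then": ("math-preparation", "strong")},
    {"if": lambda f: f.get("SemOfCS", 0) >= 2,
     "then": ("cs-experience", "some")},
]

def infer(rules, facts):
    """Inference engine: knows nothing about placement, only how to apply rules."""
    conclusions = {}
    for rule in rules:
        if rule["if"](facts):
            attribute, value = rule["then"]
            conclusions[attribute] = value
    return conclusions

# A hypothetical case: the first rule fails on Grades, the second applies.
case = {"SemOfMath": 8, "Grades": 2.75, "SemOfCS": 2}
print(infer(RULES, case))
```

Because the rules are data, new or revised courses can be handled by editing the rule base alone, without changing the engine.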
-
To view the rules in a dtterm window, type
more ~walker/placement/newstudents.lsp
This command displays the first screen of the rule base. Hit the space bar
to view subsequent screens.
While the programming details require considerable sophistication in LISP,
you will notice that many elements of the code are reasonably similar to
Scheme -- following a list-oriented format. The general organization of
the rule base is reasonably straightforward:
-
After loading the inference engine ("/users/walker/placement/tmycin.lsp"),
the main variables (i.e., the context) for placement are given.
-
Semesters of English and high school credits are used to infer how many
semesters of high school work are recorded on the current transcript.
-
Standardized test scores are classified as high, good, fair, poor, or low.
-
Preliminary placements are made for mathematics and then for computer
science by considering minimum cutoff levels. (Note that, because the rules
specify minimums, someone prepared for calculus II is also adequately
prepared for calculus I.)
-
The final placement in each subject is the highest tentative placement.
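The classification and placement steps above might look roughly like the following Python sketch (Python is used only for illustration; the real rules are in LISP). All cutoff values, band requirements, and course names here are hypothetical stand-ins, not the values in the actual rule base.

```python
# Hypothetical ACT cutoffs, highest band first; the real rule base defines its own.
ACT_BANDS = [(30, "high"), (26, "good"), (22, "fair"), (18, "poor")]

def classify(score, bands=ACT_BANDS):
    """Return the first band whose minimum the score meets, else 'low'."""
    for minimum, label in bands:
        if score >= minimum:
            return label
    return "low"

# Courses in increasing order of difficulty (illustrative names).
COURSE_ORDER = ["precalculus", "calculus I", "calculus II"]

# Minimum band assumed necessary for each tentative placement (hypothetical).
BAND_RANK = {"low": 0, "poor": 1, "fair": 2, "good": 3, "high": 4}
MINIMUM_BAND = {"precalculus": "low", "calculus I": "fair", "calculus II": "high"}

def tentative_placements(band):
    """Every course whose minimum is met; meeting a higher minimum implies the lower ones."""
    return [c for c in COURSE_ORDER if BAND_RANK[band] >= BAND_RANK[MINIMUM_BAND[c]]]

def final_placement(band):
    """The final placement is the highest of the tentative placements."""
    return max(tentative_placements(band), key=COURSE_ORDER.index)

band = classify(28)
print(band, tentative_placements(band), final_placement(band))
```

Since each rule states only a minimum, a strong student satisfies several rules at once, which is why a separate "take the highest" step is needed.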
A certainty factor is associated with each rule, based on a 1000-point
scale. A factor of 1000 indicates great confidence in the conclusion,
while 700 or 500 shows much less certainty in the result.
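To see how certainty factors behave when two rules support the same conclusion, the Python sketch below applies the classic MYCIN-style combination formula for positive factors, rescaled to 0 through 1000. That tmycin.lsp combines factors in exactly this way is an assumption here, and the function name combine is hypothetical.

```python
SCALE = 1000  # certainty factors run from 0 to 1000 in this sketch

def combine(cf1, cf2):
    """Combine two positive certainty factors that support the same conclusion.

    MYCIN-style formula (assumed, not read from tmycin.lsp): the second
    piece of evidence closes part of the remaining gap to full certainty.
    """
    return cf1 + cf2 * (SCALE - cf1) // SCALE

print(combine(700, 500))   # 700 + 500 * 300 // 1000 = 850
print(combine(1000, 500))  # already certain, so the result stays at 1000
```

Under this formula, two moderately confident rules reinforce each other, but the combined factor never exceeds 1000.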
Use your knowledge of Scheme/LISP lists to interpret the likely syntax of
rules in this system.
Files for the Placement Program:
When this system is used each year to make recommendations to incoming
students, the Registrar's Office sends the Department of Mathematics and
Computer Science a file of student transcript information. A fictional
file of this type is available in
~walker/261/labs/student.raw-data.
-
Examine this data file to determine the type and format of information
contained.
Since the expert system is written in LISP, which is list-oriented, the
placement program takes input from a file in which the data for each
student are in a separate list. For example, a typical entry might be:
("Person I M A" ( (ACT 28) (SemOfEng 8)(SemOfMath 8)
(Grades 2.75)(SemOfPCalc 1)(PCalcGrades 3.00)
(SemOfCS 2)(CSGrades 4.00) (TSemUnits 36) )
(Campus-Box |20-01| ) (Adviser "My Friend") )
The full file for the above fictional students is available in
~walker/261/labs/student.data.
Here, the person's name is the first entry in the list. The second list
component is a list of attributes. The person's mailbox and advisor, when
known, are the remaining elements of the main list.
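To make this structure concrete, the following Python sketch reads the sample entry shown above into nested lists and splits it into its parts. It is only an illustration: the placement program itself reads the file with LISP, and the small reader below (the names TOKEN and read are hypothetical) handles just the syntax that appears in this one entry.

```python
import re

# Tokens: parentheses, "double-quoted strings", |bar-quoted symbols|, bare atoms.
TOKEN = re.compile(r'\(|\)|"[^"]*"|\|[^|]*\||[^\s()"|]+')

def read(text):
    """Parse one LISP-style list into nested Python lists."""
    tokens = TOKEN.findall(text)
    pos = 0

    def parse():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        if token == "(":
            items = []
            while tokens[pos] != ")":
                items.append(parse())
            pos += 1  # skip the closing parenthesis
            return items
        if token[0] in '"|':
            return token[1:-1]  # strip the quoting characters
        try:
            return float(token) if "." in token else int(token)
        except ValueError:
            return token  # a symbol such as ACT

    return parse()

ENTRY = '''("Person I M A" ( (ACT 28) (SemOfEng 8)(SemOfMath 8)
 (Grades 2.75)(SemOfPCalc 1)(PCalcGrades 3.00)
 (SemOfCS 2)(CSGrades 4.00) (TSemUnits 36) )
 (Campus-Box |20-01| ) (Adviser "My Friend") )'''

# First element: name. Second: attribute list. Anything left: box and adviser.
name, attributes, *rest = read(ENTRY)
print(name)
print(attributes[0])
print(rest)
```

Unpacking with *rest reflects the fact that the mailbox and adviser are present only when known.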
Within the fictional student file, it is worthwhile to note several
characteristics which add complexity to this placement problem:
-
I. M. A. Person took only the ACT, while Mickey Mouse took only the SAT.
-
Popeye the Sailor took both tests, and those test scores are consistent,
while Tweety Bird's scores on the two tests differ widely.
-
While most transcripts show four years of high school, Tweety Bird's shows
only three years and Popeye the Sailor's shows three and a half years. For
Tweety Bird, either the school is a three-year high school or the fourth
year has not been reported. For Popeye the Sailor, the final semester
probably has not yet been reported.
Such variations are common in applications involving expert systems.
Note that the list of attributes is a list of pairs, where each entry
(e.g., (ACT 28)) gives the type of information (e.g., ACT) first,
followed by the data. Such a list of pairs is
called an association list, and such lists are a particularly common
mechanism for storing data within expert systems. We will see more about
processing data within association lists in a later lab.
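As a preview, a lookup in an association list can be sketched as follows. The sketch is in Python for illustration only; it mirrors the behavior of Scheme's assoc procedure, which returns the whole matching pair rather than just the value. The attribute data are copied from the sample entry above.

```python
def assoc(key, alist):
    """Return the first [key, value] pair whose key matches, or None."""
    for pair in alist:
        if pair[0] == key:
            return pair
    return None

attributes = [["ACT", 28], ["SemOfEng", 8], ["SemOfMath", 8], ["Grades", 2.75]]

print(assoc("ACT", attributes))  # the pair for ACT
print(assoc("SAT", attributes))  # no SAT score recorded, so None
```

Returning None for a missing key matters in this application, since some students have no ACT score and others have no SAT score.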
This example of files illustrates a common circumstance in computing, where
data from one application (e.g., the Registrar's file) comes in one format,
while data for another application (e.g., the placement program) requires a
second format.
-
Challenge Problem: Write a program which reads a data file in the
Registrar's format and produces a corresponding file in the list-oriented
format.
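As a starting point, the output half of such a conversion might be sketched as below, in Python for illustration (the problem may equally be solved in Scheme or LISP). The sketch deliberately assumes nothing about the Registrar's format: it starts from values that have already been extracted, and reading student.raw-data is left as the exercise. The function name format_entry and its parameters are hypothetical.

```python
def format_entry(name, attributes, campus_box=None, adviser=None):
    """Render one student as a list in the placement program's input format."""
    pairs = " ".join(f"({key} {value})" for key, value in attributes)
    parts = [f'"{name}"', f"( {pairs} )"]
    if campus_box is not None:  # mailbox is included only when known
        parts.append(f"(Campus-Box |{campus_box}| )")
    if adviser is not None:     # adviser is included only when known
        parts.append(f'(Adviser "{adviser}")')
    return "(" + " ".join(parts) + " )"

print(format_entry("Person I M A",
                   [("ACT", 28), ("SemOfEng", 8), ("Grades", 2.75)],
                   campus_box="20-01", adviser="My Friend"))
```

Keeping the formatting step separate from the reading step means that only the reader has to change if the Registrar's format changes.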
This document is available on the World Wide Web as
http://www.math.grin.edu/~walker/courses/153/lab-placement.html
created January 8, 1998
last revised November 1, 1998