CSC 213, Fall 2006 : Schedule : Lab 8

Lab 8: An Expanded UNIX-style Shell

Goals:

Background: This lab draws upon three sources:

In particular, this lab is inspired by Nutt's Lab Exercise 9.2, which discusses strategies for implementing concurrent processes and I/O redirection in a UNIX-style shell.

Collaboration: You will complete this lab in teams of 2 or 3 as assigned by the instructor.  (These teams are the same as for Lab 2.) You may, as always, consult with your classmates on issues of design and debugging.


Assignment

Lab Exercise 2 for this course asked you to write a simple Unix-style shell according to the following outline:

  1. Write a program that reads successive command lines from a terminal window.
  2. For each command line, the program
    1. breaks the command line into tokens (the pieces separated by spaces),
    2. places the command tokens into an array of command-line strings,
    3. identifies the location of the desired program by searching the user's PATH variable for the given program, and
    4. uses fork to spawn a child process and execv within the child process to actually run the desired program.
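As a reminder of how the tokenizing and fork/execv steps of that outline fit together, here is a minimal sketch. It is not the Lab 2 solution: the names tokenize, run_and_wait, and MAX_TOKENS are made up for illustration, the PATH search is omitted, and the caller is assumed to supply the full path of the program.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_TOKENS 64  /* hypothetical limit chosen for this sketch */

/* Split line in place on spaces, tabs, and newlines.  Stores pointers to
   the tokens in argv, ends the array with NULL (as execv expects), and
   returns the number of tokens found. */
int tokenize(char *line, char *argv[], int max_tokens)
{
    int count = 0;
    char *token = strtok(line, " \t\n");
    while (token != NULL && count < max_tokens - 1) {
        argv[count++] = token;
        token = strtok(NULL, " \t\n");
    }
    argv[count] = NULL;
    return count;
}

/* Run the program at path with arguments argv in a child process and
   wait for it to finish.  Returns the child's exit status, or -1 if the
   child could not be created or did not exit normally. */
int run_and_wait(const char *path, char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        /* Child: replace this process image with the requested program. */
        execv(path, argv);
        perror("execv");   /* reached only if execv fails */
        _exit(127);
    }
    /* Parent (the shell): wait for the child before prompting again. */
    int status;
    if (waitpid(pid, &status, 0) < 0) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A shell loop would read a line, call tokenize on it, resolve argv[0] against PATH, and pass the result to run_and_wait.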

In this lab, you are to extend the shell in four ways:

  1. Implement the "&" modifier so that if the last character on the command line is "&", the program is executed concurrently with the shell, rather than the shell waiting for it to complete.
  2. Allow the standard input or output to be redirected from or to a file using the "<" and ">" symbols.
  3. Allow the standard output from one program to be redirected into the standard input of another program using the "|" symbol.
  4. Finally, modify your shell so that it can simultaneously support "<", "|", ">", and "&" in a single command line.
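For the first extension, one possible approach is sketched below: check whether the last non-blank character of the line is "&", and if so skip the wait in the parent. The function names take_background_flag, launch, and reap_background are hypothetical, and the sketch assumes the program's full path has already been found.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* If the last non-blank character of line is '&', cut the line off at
   that character and return 1.  Otherwise leave line alone and return 0. */
int take_background_flag(char *line)
{
    size_t len = strlen(line);
    while (len > 0 && isspace((unsigned char)line[len - 1]))
        len--;
    if (len > 0 && line[len - 1] == '&') {
        line[len - 1] = '\0';
        return 1;
    }
    return 0;
}

/* Start the program at path.  In the foreground case the shell waits for
   the child; in the background case it returns immediately.  Returns the
   child's process id, or -1 if fork fails. */
pid_t launch(const char *path, char *const argv[], int background)
{
    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        execv(path, argv);
        perror("execv");   /* reached only if execv fails */
        _exit(127);
    }
    if (!background && waitpid(pid, NULL, 0) < 0)
        perror("waitpid");
    return pid;
}

/* Collect background children that have already finished, without
   blocking, so they do not linger as zombies.  Returns how many were
   collected.  A shell might call this before printing each prompt. */
int reap_background(void)
{
    int reaped = 0;
    while (waitpid(-1, NULL, WNOHANG) > 0)
        reaped++;
    return reaped;
}
```

Calling waitpid with WNOHANG is only one way to clean up finished background jobs; handling SIGCHLD is another, and this sketch does not cover it.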

Examples

Input Redirection

Within the Unix/Linux command-line environment, programs normally read from "standard in" and write to "standard out". By default, "standard in" is usually the keyboard, and "standard out" is usually a computer monitor at a person's workstation. For example, consider the program max-min.c. This program reads an integer n, followed by n real numbers, and finds the maximum, minimum, and average of the real numbers. It then reads an index i and prints the ith real number. All reading is done from "standard in" and output is to "standard out". Thus, a typical run might be:

% gcc -o max-min max-min.c
% ./max-min
Program to process real numbers.
Enter number of reals: 7
Enter 7 numbers: 3.0 1.0 4.0 1.0 5.0 9.0 2.0
Maximum: 9.00
Minimum: 1.00
Average: 3.57

Enter the index (1..n) of the number to be printed: 6
The 6 th number is 9.00

In running the program, Unix allows data to be read from a file, rather than from standard in. For example, suppose that a file data contains the following entries, which repeat exactly what the user typed in the above session:

7
3.0 1.0 4.0 1.0 5.0 9.0 2.0
6

Unix allows information to be read from the file data rather than from the keyboard, with the following command:

% ./max-min < data

In this command, the less-than sign (<) indicates that the file named after it (data) should be used for input rather than the keyboard. The output is exactly as above, except that the user's typing is not seen (it came from the file).

In this example, the output looks a bit strange, as the prompts for data are still printed to the screen, even though the input from the file is not echoed. When redirecting input, it is sometimes advised not to prompt the user, although that is more a matter of form than a technical requirement.

Output Redirection

Similarly, all output could be written to a file rather than to a monitor. Thus, to write the output to a file called results, we might use the following command:

% ./max-min > results

In this case, the program will wait for you to enter the relevant data, but no prompts appear on your screen. Rather, all output goes to the file results.

Input and Output Redirection

Redirection of both input and output can be combined on a single Unix command line:

% ./max-min < data > results

In this context, nothing appears on the screen either from the user typing input or the program printing output.

Implementing I/O Redirection

Nutt discusses I/O redirection in Lab Exercise 9.2 using open and dup.

Alternatively, an implementation of input or output redirection paralleling the use of pipes may follow 3 main steps:

  1. Use open to set up the file, giving an integer file descriptor as a result. (Use the flag O_RDONLY for reading, or the flags O_WRONLY | O_CREAT for writing. With O_CREAT, open also requires a third argument giving the permission mode for a newly created file; adding O_TRUNC empties an existing file, as ">" does in standard shells.)
  2. Use dup2 to copy the file descriptor to STDIN_FILENO or STDOUT_FILENO.
  3. Close the file descriptor opened in step 1, since standard in or out will handle the relevant I/O tasks.

Piping within a Unix command line

Within a Unix command line, one can designate the output of one program as the input of another.  For example, the command "ps -u <username>" will list all processes belonging to the named user:

% ps -u davisjan
  PID TTY          TIME CMD
19709 ?        00:00:00 artsd
23580 ?        00:00:00 x-session-manag
23639 ?        00:00:00 ssh-agent
23642 ?        00:00:00 dbus-daemon
23643 ?        00:00:00 dbus-launch
23645 ?        00:00:01 gconfd-2
23648 ?        00:00:00 gnome-keyring-d
23650 ?        00:00:00 bonobo-activati
23652 ?        00:00:01 gnome-settings-
23654 ?        00:01:17 metacity
23663 ?        00:00:02 gnome-panel
23665 ?        00:00:03 nautilus
23669 ?        00:00:30 gam_server
23673 ?        00:00:00 update-notifier
23675 ?        00:00:26 gnome-cups-icon
23679 ?        00:00:00 gnome-volume-ma
23691 ?        00:00:00 gnome-vfs-daemo
23707 ?        00:00:03 wnck-applet
23709 ?        00:00:00 mapping-daemon
23711 ?        00:00:00 clock-applet
23713 ?        00:00:00 multiload-apple
23762 ?        00:00:01 gnome-screensav
 7508 ?        00:00:02 gnome-terminal
 7509 ?        00:00:00 gnome-pty-helpe
21170 pts/1    00:00:00 tcsh
21184 ?        00:00:04 epiphany
21472 ?        00:00:00 run-mozilla.sh
21478 ?        00:01:12 nvu-bin
22791 pts/2    00:00:00 tcsh
24313 ?        00:00:03 mred
24348 pts/1    00:00:00 ps

Perhaps I wish to see only those applications that have gnome in their name.  One approach would be to use the grep program to filter the results from ps.  In this approach, we want to generate the full listing, send the output through a pipe to grep, filter the process descriptions, and print the results:

% ps -u davisjan | grep gnome
23648 ?        00:00:00 gnome-keyring-d
23652 ?        00:00:01 gnome-settings-
23663 ?        00:00:02 gnome-panel
23675 ?        00:00:26 gnome-cups-icon
23679 ?        00:00:00 gnome-volume-ma
23691 ?        00:00:00 gnome-vfs-daemo
23762 ?        00:00:01 gnome-screensav
 7508 ?        00:00:02 gnome-terminal
 7509 ?        00:00:00 gnome-pty-helpe

In this command, the vertical line "|" indicates that the output of one program should be sent to the next program as its input.

As an additional step, we might want to exclude all processes that have run for a negligible amount of time, using grep -v '00:00:00'. Adding another pipe to the previous command to include this step yields:

% ps -u davisjan | grep gnome | grep -v '00:00:00'
23652 ?        00:00:01 gnome-settings-
23663 ?        00:00:02 gnome-panel
23675 ?        00:00:26 gnome-cups-icon
23762 ?        00:00:01 gnome-screensav
 7508 ?        00:00:03 gnome-terminal

Implementation of such pipes follows closely the programs fork-2.c through fork-6.c in An Introduction to Concurrency in Unix-based [GNU] C Through Annotated Examples.
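For orientation, here is a minimal sketch of a single pipe between two programs using pipe, fork, dup2, and execv. It is not taken from the fork-2.c through fork-6.c examples; the function name run_pipe is hypothetical, and the caller is assumed to supply the full path of each program.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run "left | right": the left program's standard out feeds the right
   program's standard in.  Returns the right program's exit status, or
   -1 on failure. */
int run_pipe(const char *lpath, char *const largv[],
             const char *rpath, char *const rargv[])
{
    int fds[2];   /* fds[0] is the read end, fds[1] is the write end */
    if (pipe(fds) < 0) {
        perror("pipe");
        return -1;
    }

    pid_t left = fork();
    if (left < 0) {
        perror("fork");
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (left == 0) {
        /* Left child: standard out becomes the write end of the pipe. */
        if (dup2(fds[1], STDOUT_FILENO) < 0) { perror("dup2"); _exit(126); }
        close(fds[0]);
        close(fds[1]);
        execv(lpath, largv);
        perror("execv");
        _exit(127);
    }

    pid_t right = fork();
    if (right < 0) {
        perror("fork");
        close(fds[0]);
        close(fds[1]);
        waitpid(left, NULL, 0);
        return -1;
    }
    if (right == 0) {
        /* Right child: standard in becomes the read end of the pipe. */
        if (dup2(fds[0], STDIN_FILENO) < 0) { perror("dup2"); _exit(126); }
        close(fds[0]);
        close(fds[1]);
        execv(rpath, rargv);
        perror("execv");
        _exit(127);
    }

    /* Parent: close both ends.  If the parent kept the write end open,
       the right child would never see end-of-file on its input. */
    close(fds[0]);
    close(fds[1]);

    int status;
    waitpid(left, NULL, 0);
    if (waitpid(right, &status, 0) < 0) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A command line with several pipes needs one pipe per "|" and one child per program; the same pattern of dup2 followed by closing every unused pipe end applies to each child.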

Combining concurrency, I/O redirection, and pipes

Note that we can use all of these features in a single command line. For example, here is a command line that uses grep to find the lines that include system header files in a C program named bounded-buffer-4.c, uses wc to obtain word count statistics, uses awk to extract the line count from these statistics, and stores the result in a file named sys_includes, all concurrently with the shell.

% grep #include < bounded-buffer-4.c | grep sys | wc | awk '{ print $1 }' > sys_includes &
[1] 17297 17298 17299 17300
You have new mail.
% cat sys_includes
5
[1]  + Done                          grep #include < bounded-buffer-4.c | grep sys | wc | awk { print $1 } > sys_includes
%

To understand how this works, try looking at the results of partial commands, e.g., "grep #include < bounded-buffer-4.c | grep sys". 
If you use the bash shell, also try out this very neat example forwarded to me by John Stone.

Note that input redirection applies only to the first command, and output redirection applies only to the last command. However, there can be an arbitrary number of pipes in between.
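One way to organize the fourth extension is to parse the whole command line into a single structure before creating any processes. The sketch below is only one possible design: the struct command_line type, the parse_command_line function, and the limits MAX_STAGES and MAX_ARGS are all hypothetical. It assumes that "<", ">", "|", and "&" are each set off by spaces, and it does not handle quoting, so an argument such as '{ print $1 }' would need extra work.

```c
#include <stdio.h>
#include <string.h>

#define MAX_STAGES 8    /* hypothetical limit on programs per line */
#define MAX_ARGS   16   /* hypothetical limit on tokens per program */

struct command_line {
    char *argv[MAX_STAGES][MAX_ARGS]; /* NULL-terminated arguments per program */
    int   stage_count;                /* number of programs joined by "|" */
    char *input_file;                 /* file after "<", or NULL */
    char *output_file;                /* file after ">", or NULL */
    int   background;                 /* 1 if the line ends with "&" */
};

/* Parse line in place into cl.  Returns 0 on success and -1 on a syntax
   error, such as "<" after the first program or "|" after ">". */
int parse_command_line(char *line, struct command_line *cl)
{
    const char *delims = " \t\n";
    int stage = 0, arg = 0;
    memset(cl, 0, sizeof *cl);

    for (char *tok = strtok(line, delims); tok != NULL;
         tok = strtok(NULL, delims)) {
        if (cl->background)
            return -1;                 /* nothing may follow "&" */
        if (strcmp(tok, "|") == 0) {
            /* A pipe needs a program before it, and output redirection
               is only allowed on the last program. */
            if (arg == 0 || cl->output_file != NULL ||
                stage + 1 >= MAX_STAGES)
                return -1;
            stage++;
            arg = 0;
        } else if (strcmp(tok, "<") == 0) {
            /* Input redirection is only allowed on the first program. */
            if (stage != 0 || cl->input_file != NULL)
                return -1;
            cl->input_file = strtok(NULL, delims);
            if (cl->input_file == NULL)
                return -1;
        } else if (strcmp(tok, ">") == 0) {
            if (cl->output_file != NULL)
                return -1;
            cl->output_file = strtok(NULL, delims);
            if (cl->output_file == NULL)
                return -1;
        } else if (strcmp(tok, "&") == 0) {
            cl->background = 1;
        } else {
            if (arg >= MAX_ARGS - 1)
                return -1;             /* keep room for the NULL terminator */
            cl->argv[stage][arg++] = tok;
        }
    }
    if (arg == 0)
        return -1;                     /* empty line or trailing "|" */
    cl->stage_count = stage + 1;
    return 0;
}
```

With the line parsed this way, the shell can create one child per program, apply input_file only to the first child and output_file only to the last, connect neighboring children with pipes, and skip the wait when background is set.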


Work to be Turned In

Part A, Due Friday, 27 October: 

Part B, Due Monday, 30 October:

If you wish, you may turn in both parts A and B on Friday, 27 October.

As with all labs, you should turn in your solution using the course's format for submitting assignments.


Janet Davis (davisjan@cs.grinnell.edu)

Created October 23, 2006 based on http://www.cs.grinnell.edu/~walker/courses/213.fa04/lab-shell-refined.shtml
Last revised October 23, 2006
With thanks to Henry Walker