Laboratory Exercises For Computer Science 151

String and File I/O Libraries

Goals: This laboratory exercise describes some additional approaches to file processing and introduces the concept of libraries for string and/or file processing.


String Libraries

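In the previous lab on reading files line-by-line, we identified several procedures that may be helpful in many circumstances. For handling strings, we developed get-word, which returns the first word of a string, and chop-word, which returns what remains of the string after that word.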

When such generally useful procedures are identified and implemented, it is common to collect them into a file. That file can then be loaded into new applications as they arise, so the useful procedures can be reused without having to write them again. Such a file, or a collection of such files, is sometimes called a procedure library.


  1. Create a new file string-lib.ss with your editor, and place copies of get-word and chop-word into the file. Then save the file, so these functions can be loaded and used in future applications.

  2. Check that your string-lib.ss file works correctly by starting up Scheme in a dtterm window, loading string-lib.ss, and trying the following tests:

    (get-word  " this is a test ")  ==> "this"
    (chop-word " this is a test ")  ==> " is a test "
    (get-word  "    2.718281828459") ==> "2.718281828459"
    (chop-word "    2.718281828459") ==> ""
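
For reference, here is one way get-word and chop-word could be written so that they pass the tests above. This is only a sketch: your versions from the previous lab may differ, the names skip-while, space?, and non-space? are invented for this illustration, and only the space character is treated as a separator (tabs are not handled).

```scheme
;; Hypothetical helper: starting at index start, skip characters that
;; satisfy pred and return the index where skipping stopped.
(define skip-while
  (lambda (str start pred)
    (let loop ((i start))
      (if (and (< i (string-length str)) (pred (string-ref str i)))
          (loop (+ i 1))
          i))))

(define space?     (lambda (ch) (char=? ch #\space)))
(define non-space? (lambda (ch) (not (char=? ch #\space))))

;; Return the first word of str, ignoring leading spaces.
(define get-word
  (lambda (str)
    (let* ((start (skip-while str 0 space?))
           (end   (skip-while str start non-space?)))
      (substring str start end))))

;; Return everything in str that follows the first word.
(define chop-word
  (lambda (str)
    (let* ((start (skip-while str 0 space?))
           (end   (skip-while str start non-space?)))
      (substring str end (string-length str)))))
```

Both procedures locate the first word in the same way; they differ only in which part of the string they return.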
    


Reading Files

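Before considering the beginnings of an analogous file library, we identify two more useful procedures for reading: read-n, which reads a given number of characters from the current line, and clear-line, which reads and discards the rest of the current line.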

Implementation of File-Reading Procedures:

Procedure read-n reads character-by-character for the correct number of characters. However, if #\newline is encountered before all of the designated characters are obtained, then the end of the string is padded with spaces. This is accomplished with the following code:
(define read-n
   (lambda (source number-chars)
      (let loop ((n number-chars) (char-list '()))
          (if (zero? n)
              (list->string (reverse char-list))
              (let ((next-char (read-char source)))
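                  ;; stop at the end of the line; also stop at end of file,
                  ;; since char=? cannot be applied to the eof object
                  (if (or (eof-object? next-char) (char=? next-char #\newline))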
                      (string-append (list->string (reverse char-list))
                                     (make-string n #\space))
                      (loop (- n 1) (cons next-char char-list))
                   )
              )
          )
      )
   )
)

To test this program, one might utilize the following simple test procedure:

(define test-read-n
   (lambda (file-name)
      (let ((source (open-input-file file-name)))
         (display "First 10 characters: ")
         (write (read-n source 10))
         (newline)
         (display "Next 16 characters: ")
         (write (read-n source 16))
         (newline)
         (display "Next 8 characters: ")
         (write (read-n source 8))
         (newline)
         (display "Next 40 characters: ")
         (write (read-n source 40))
         (newline)
         (close-input-port source)
      )
  )
)
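
If no data file is handy, read-n can also be tried on a string port. The following sketch assumes that your Scheme provides open-input-string (many implementations do, but it is not part of R5RS). The definition of read-n is repeated here, including an end-of-file check, so that the block stands alone; the sample text is made up for this illustration.

```scheme
;; read-n, repeated so this block can be run by itself.
(define read-n
  (lambda (source number-chars)
    (let loop ((n number-chars) (char-list '()))
      (if (zero? n)
          (list->string (reverse char-list))
          (let ((next-char (read-char source)))
            (if (or (eof-object? next-char) (char=? next-char #\newline))
                (string-append (list->string (reverse char-list))
                               (make-string n #\space))
                (loop (- n 1) (cons next-char char-list))))))))

;; Two short lines of made-up text, built without string escapes.
(define sample-text
  (string-append "abcdefghij" (string #\newline)
                 "KLM" (string #\newline)))

(define sample-port (open-input-string sample-text))

(define first-piece  (read-n sample-port 4))    ; "abcd"
(define second-piece (read-n sample-port 10))   ; "efghij" followed by 4 spaces
(define third-piece  (read-n sample-port 2))    ; "KL"

(write second-piece)
(newline)
(write (string-length second-piece))            ; 10
(newline)
```

The second call meets the newline after only six characters, so it pads the result with four spaces; the newline itself is consumed, which is why the third call starts at the next line.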


  1. Run test-read-n on the file "/home/walker/151s/labs/file4.dat", which contains one line with the characters abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ. Thus, the call

    (test-read-n "/home/walker/151s/labs/file4.dat")

    should print the following:

    First 10 characters: "abcdefghij"
    Next 16 characters: "klmnopqrstuvwxyz"
    Next 8 characters: "ABCDEFGH"
    Next 40 characters: "IJKLMNOPQRSTUVWXYZ                      "
    

    In this output, note that the last call to read-n reads the final newline character and then pads the string with space characters in order to obtain a string of the correct length.

    Check that the last line has the correct number of spaces at the end.

  2. Run test-read-n on another file that you have constructed, so you can check some additional cases.

  3. Implement procedure clear-line, as described above. Note that this code may follow the same general format of read-line from the previous lab, except that no characters have to be stored in the main loop.
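
To compare against your own solution, here is one possible shape for clear-line. It is a sketch only: it assumes that clear-line should discard everything through the next newline (as its use later in this lab suggests), that returning #t is acceptable since the result is not used, and that open-input-string is available for the small demonstration.

```scheme
;; Read and discard characters through the next newline,
;; or until the end of the file if no newline remains.
(define clear-line
  (lambda (source)
    (let loop ((ch (read-char source)))
      (if (or (eof-object? ch) (char=? ch #\newline))
          #t                            ; arbitrary return value
          (loop (read-char source))))))

;; Demonstration on made-up text held in a string port.
(define demo-port
  (open-input-string
    (string-append "first line, ignored" (string #\newline) "second")))

(clear-line demo-port)
(define after-clear (read-char demo-port))   ; #\s, the start of the second line
```

Unlike read-line, the loop keeps no list of characters; each character is simply read and dropped.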


A File Library

  1. Collect the file-reading procedures described thus far into a file library, file-lib.ss. Thus, file-lib.ss should contain at least read-n and clear-line.

    Note that all of these procedures assume that the source file has already been opened in a separate main program.

  2. Run a few tests to be sure that file-lib.ss works correctly.


More Processing of Text Files

The previous lab analyzed the sixty largest cities and towns in Iowa to determine their percentage increase or decrease from 1980 to 1990. We now consider how to modify the previous program to determine the name of the Iowa city with the largest 1990 population (i.e., Des Moines).

General Approach:

Our general approach to find the largest city might follow the same idea we have followed many times to find a maximum number. A typical outline to find a maximum number in a file follows:

  1. Open the input file
  2. Read the first number -- this is the current maximum
  3. Until the end of the file is reached:
    1. Read the next number
    2. If the new number is larger than the current maximum
      1. update the current maximum
  4. Print the maximum value found
  5. Close the input file
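
The central steps of this outline (steps 2 and 3) might be sketched in Scheme as follows. The procedure name max-in-port is invented for this illustration; it assumes that the port is already open and contains at least one number, and it leaves opening, printing, and closing (steps 1, 4, and 5) to the caller. The demonstration uses open-input-string, which many Scheme implementations provide although it is not part of R5RS.

```scheme
;; Return the largest number that can be read from an open input port.
(define max-in-port
  (lambda (source)
    (let loop ((current-max (read source)))    ; first number is the maximum so far
      (let ((next (read source)))
        (if (eof-object? next)
            current-max                          ; end of input: report the maximum
            (loop (if (> next current-max)       ; otherwise keep the larger value
                      next
                      current-max)))))))

;; Demonstration on made-up numbers held in a string port.
(define largest (max-in-port (open-input-string "3 17 5 11")))   ; 17
```

The named let carries the current maximum as a loop parameter, so no assignment is needed to update it.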

The approach for finding the largest city is similar, although now the program must read the city name, the 1990 population, and the 1980 population for each city. (While the 1980 population is not needed, we still read it, so we can get to the next data in the file.) In what follows, we have modified the line-by-line processing so that we can remember the maximum identified so far.

(load "file-lib.ss")
(define find-max-city
  (lambda (source-file-name)
    (let* ((source (open-input-file source-file-name))  ; Open the input file.
           (first-city (read-n source 16))
           (first-pop-90 (read source))
           (first-pop-80 (read source))
           (first-remainder (clear-line source)))
      (let loop1 ((ch (peek-char source))      ; Peek at the next character.
                  (max-city first-city)
                  (max-90 first-pop-90))
        (if (eof-object? ch)                   ; If you get the eof-object,
            (begin
              (display "The largest city is ")
              (display max-city)
              (display " with a population of ")
              (display max-90)
              (newline)
              (close-input-port source))        ; close the input file
            (begin
              ;; process-line in the previous code is expanded here
              (let* ((next-city (read-n source 16))
                     (next-pop-90 (read source))
                     (next-pop-80 (read source))
                     (next-remainder (clear-line source)))
                 (if (< max-90 next-pop-90)
                     (loop1 (peek-char source)
                            next-city
                            next-pop-90)
                     (loop1 (peek-char source)
                            max-city
                            max-90)
                  )
               )
            )
        )
      )
    )
  )
)

  1. Test this program on the Iowa cities file:

    (find-max-city "/home/stone/courses/scheme/Iowa-cities.dat")
    
  2. Explain why two let* expressions are needed in this procedure. Also, why is let* used rather than let?

  3. Does the procedure still work if the lines for first-pop-80 and next-pop-80 are deleted? Explain why or why not.

  4. Why does the output of this program leave a space after the city name?

  5. Modify find-max-city to find the city with the maximum percentage increase from 1980 to 1990.
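
For the last step, a small helper for the percentage computation may be convenient. The sketch below uses an invented name, percent-change, and assumes that the earlier population is nonzero; the sample values are made up and are not taken from the Iowa data.

```scheme
;; Percentage change from old-pop to new-pop, as an inexact number.
(define percent-change
  (lambda (old-pop new-pop)
    (* 100.0 (/ (- new-pop old-pop) old-pop))))

(define growth  (percent-change 1000 1250))   ; 25.0
(define decline (percent-change 2000 1500))   ; -25.0
```

In find-max-city, the comparison in the main loop would then be made on this computed value rather than directly on the 1990 population.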


This document is available on the World Wide Web as

http://www.math.grin.edu/~walker/courses/151.sp99/lab-file-library.html

created April 5, 1999
last revised April 9, 1999 by Henry M. Walker