CSC 201 Grinnell College Spring, 2005
 
Data Representation, Memory Management, and Formal Methods
 

Files in C

Goals

This laboratory provides experience retrieving data from text files using C.

Background: a File of City Data

The file ~walker/c/files/city.dat contains some historical data regarding several large American cities. More specifically, in city.dat, each entry consists of the name of the city (line 1), the county or counties (line 2) and the state (line 3) in which it is situated, the year in which it was incorporated (line 4), its population as determined by the census of 1980 (line 5), its area in square kilometers (line 6), an estimate of the number of telephones in the city (line 7), and the number of radio stations (line 8) and television stations (line 9) serving the city. Thus a typical entry reads as follows:

   Albuquerque
   Bernalillo
   New Mexico
   1891
   331767
   247
   323935
   14
   5

A blank line follows each entry, including the last.

  1. Review this data file, since the following discussion is based on the format and content of this material.

This file illustrates a common format for data files: Information within the file is organized by lines, material on related topics is grouped together within a block of lines, each line or section of a line has a specified meaning within a block, and the file may be considered as text that could be edited by a standard editor, such as emacs or vi.

Typical File Processing

The following questions are typical of those asked about such data files:

  1. How can I get a list of all of these cities and the information that goes with them?
  2. Which of the cities listed in the file were incorporated more than 150 years ago?
  3. Which of these cities has the lowest per capita number of radio and TV stations?

Processing Data in Files

Program city-file.c shows a typical organization for addressing such questions. Program city-file-alt.c is a slightly more streamlined program that performs the same tasks.

Of course, details of reading and printing depend greatly upon the application, but the basic approach illustrated here works in many cases.

  2. Before going on to the rest of this lab, review /home/walker/c/files/city-file.c and /home/walker/c/files/city-file-alt.c, to be sure you understand how these programs work.

Median Family Income

File ~walker/c/files/state-income.dat contains information about the median annual income for a 4-person family for the various states for the years 1997 back to 1979. As shown in the file sample that follows, the first five lines contain header information (2 lines of title, a blank line, column headings for various years, and another blank line). Thereafter, the information about each state is on a separate line. Within a line, the state name is left justified in the first 21 characters, and income figures are in 6-character-wide columns (the income appears as 5 characters, and a single blank space separates one year's income figure from the next).


                         Median Income for 4-Person Families, by State, According to the U.S. Census Bureau
                        Reported November 3, 1999 on Web site http://www.census.gov/hhes/income/4person.html

Year                  1997  1996  1995  1994  1993  1992  1991  1990  1989  1988  1987  1986  1985  1984  1983  1982  1981  1980  1979

United States        53350 51518 49687 47012 45161 44615 43056 41451 40763 39051 36812 34716 32777 31097 29184 27619 26274 24332 22395
Alabama              48240 44879 42617 41730 37975 39659 37638 35937 34930 33022 31221 29799 28407 26595 25117 24181 22443 22026 18613
Alaska               57474 62078 56045 53555 51181 49632 49721 51538 48411 47247 47106 41292 42897 44017 38238 31823 35834 32745 31037
Arizona              47133 45032 44526 41599 39679 39900 39364 38799 38347 36892 35711 33477 32129 29431 27551 29835 25163 23832 23000
Arkansas             38646 36828 38520 36510 32594 36682 34566 31913 31853 28665 27415 27157 26255 23075 21524 20710 20583 19448 18493
California           55217 53807 51519 48755 44643 46774 46643 45184 42813 41425 40218 37655 36223 33711 31967 29885 27763 26070 25109
Colorado             58988 53632 50941 48801 47112 45021 43136 41803 40265 39095 37778 36026 35214 34154 32294 30663 28756 25943 25228
Connecticut          72706 67380 62157 62107 59288 55061 54479 53931 53313 50720 47195 44330 40677 39070 37703 35361 31108 28376 24410

Use the information in this file to perform the following processing:

  3. Extract from this file the name of each state and its median income for 1995. Put the results in a new file, called state-income-for-1995.

    That is, after processing the above file, the new file state-income-for-1995 should begin:

    State          1995 Median Income
    
    United States        49687
    Alabama              42617
    Alaska               56045
    Arizona              44526
    Arkansas             38520
    California           51519
    
  4. From the original state-income file, extract each state's median income for a given year.

    That is, the program (or program segment) should ask a user for a year between 1979 and 1997 (inclusive) and print to the screen a listing of state names and their median incomes for that year.

  5. From the original state-income file, print to the screen the names of those states for which the median income decreased from some year to the next.

Work to turn in:

Your programming for steps 3, 4, and 5 may be done in a single program or in three separate programs.

As with other programs, your work should follow the course's specified format for submitting assignments. In this case, your script file should contain a listing of the file state-income-for-1995 that is generated for step 3.


This document is available on the World Wide Web as

     http://www.cs.grinnell.edu/~walker/courses/201.sp05/lab-files.html

created 27 September 2001
last revised 2 May 2005
For more information, please contact Henry M. Walker at walker@cs.grinnell.edu.