Association Lists Summary: We consider association lists, a common technique for storing information, as well as the assoc procedure, which is used for searching through those lists.
Introduction Consider a simple problem that has been handled behind the scenes in MediaScript: Converting from color names, represented as strings, to the color values we are so accustomed to. While one approach would be to write a procedure with a very long cond expression, that could take a long time to write. As importantly, such a solution is not very adaptable. Another approach is to represent the information as if it were a phone book or a dictionary, with a long list of entries. Each entry would have a color name (e.g., "white"), the red, green, and blue values (e.g., 255, 255, and 255), and perhaps even some additional information, such as the names of common palettes to which the color belongs. You might come up with something like the following: Color R G B Additional Information --------- --- --- --- ----------------------------- red 255 0 0 primary,rainbow,reds,web-safe green 0 255 0 primary,rainbow,web-safe yellow 255 255 0 secondary,rainbow blood-red 102 0 0 reds,web-safe bronze 140 120 83 metallic When computer scientists represent information in this way, they tend to refer to the information as a table. Often, we call a group of tables a database. Once information is grouped into a table, we can write procedures to select information, such as all of the rainbow colors.
Representing Database Tables A great deal of research has gone into ways to represent database tables efficiently, so that the queries we make to select elements or create new tables are quick. These structures are beyond the scope of this course. For now, we will choose a simple and understandable representation that builds upon the core Scheme structures. In particular, to to represent each of these individual entries, we can use a list, such as ("red" 255 0 0 "primary" "rainbow" "reds" "web-safe") or ("bronze" 140 120 83 "metallic"). In each case, the name of the color is the car of the entry, the red, green, and blue components are the cadr, caddr, and cadddr of the entry, respectively. An entire color description, then, would be a list of such entries. (We're leaving out the other attributes, for now, because they may complicate things.) While this is a simple representation, it is frequently used by Scheme programmers who want to describe tables. There's even a name for this arrangement: In Scheme, a list of pairs or lists in which we use the value of the car to identify an entry is called an association list or alist. As the color directory example suggests, a common application of association lists involves looking for a desired name or first component of an entry and then returning another part of the entry (or something computed from other parts of the entry). For example, we are likely to search the color directory for color names and then build a new color based on the three components. Because the first entry has a special role, we often refer to it as the key of the entry. We refer to the rest of the entry as the associated value. In the color list above, the color names are the keys (e.g., "black", "blood red", and "blue") and the RGB components and attributes are the associated values.
<code>assoc</code>, Scheme's Built-in Lookup Procedure Since applications in which we need to extract data from tables are very common, Scheme provides procedures to retrieve from an association list the entry containing a specified key. The most frequently used procedure of this kind is assoc. Given a key and association list, assoc returns the first entry with the given key. If the key does not occur in the association list, then assoc returns #f. For example,
Extracting Information Once we've used assoc to find an entry, what do we do next? Most typically, we then do something with the rest of the entry. To continue our color table example, we might want to convert the three components to an RGB color. So, what do we do? We use assoc to look up an entry with the given color name. If we find the entry, we then extract the red, green, and blue components and turn them into an RGB color. If we don't find the entry, we return some default value (say rgb-transparent). In code, we write For example, Note that the result depends on the directory. For example, Some of you may recall asking why if might take a value other than #t or #f as a parameter. This procedure is one example of why.
Looking Up a Color, Revisited One problem with the lookup-color-by-name procedure given above is that it calls assoc four times with identical parameters, once to determine whether the color name is in the list and once to determine each of the three components. Rather than calling assoc four times, we might call it once, name that result, and use it each time.
Building More Complex Entries If you recall our initial plans for the color table, we wanted to store more than just a color name and its components. We also thought it would be useful to store attributes of the color, such as whether or not it is Web safe. Hence, we might update each entry to include that additional information, which we might represent as strings. You should note a few things about this revised list. First, we've left the red, green, and blue components in exactly the same place, so that lookup-color-by-name still works correctly. Second, we've made the association list easier to read by lining up corresponding values, as in a table. We're taking advantage of the fact that Scheme ignores extra spaces.
Implementing <code>assoc</code> As is the case with many of the built-in Scheme procedures, assoc is relatively easy to implement ourselves. In essence, assoc is much like other algorithms we've written to look for a value in a list. That is, assoc recursively steps through the list until it finds a match or runs out of elements.
Using Other Keys The assoc procedure works fine if the key is the first element of a data item. But what if it's the second (or third, or fourth, or ...). For example, what if we want a color whose red component is 255 or 102? Then we can't rely on assoc, because it only looks at the first element of each list. Instead, we need to write our own procedure, using the same structure as above. For example, to find a color with a particular red component, we might write It might even make sense to generalize this procedure, so that we can look up by any of the three components. We can now define lookup-by-red (and even lookup-by-green and lookup-by-blue) in terms of lookup-by-component. (define lookup-by-red (lambda (red ctable) (lookup-by-component red 1 ctable))) (define lookup-by-green (lambda (green ctable) (lookup-by-component green 2 ctable))) (define lookup-by-blue (lambda (blue ctable) (lookup-by-component blue 3 ctable))) What if we want to return all the colors with a particular component, or permit colors where the component is close enough to the desired component? We'll explore that issue in lab.