Reading: Design patterns and higher-order procedures

Read By
Friday, Nov 9, 2018
Summary
In this reading, we explore higher-order procedures: procedures that take procedures as parameters or return procedures as results. In effect, we are considering what happens when we allow procedures to serve as values.

Background: Design patterns

One mark of successful programmers is that they identify and remember common techniques for solving problems. Such abstractions of common structures for solving problems are often called patterns or design patterns. You should already have begun to identify some patterns. For example, you know that procedures almost always have the form

(define procname
  (lambda (parameters)
    body))

You may also have a pattern in mind for the typical recursive procedure over lists:

(define procname
  (lambda (lst)
    (if (null? lst) 
        base-case
        (do-something (car lst) (procname (cdr lst))))))

In some languages, these patterns are simply guides to programmers as they design new solutions. In other languages, such as Scheme, you can often encapsulate a design pattern in a separate procedure.
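
As a quick illustration of the list-recursion pattern above, here is a sketch that fills in base-case with 0 and do-something with "add the square of the car to the recursive result". The name sum-of-squares is our own choice for this example and does not appear elsewhere in the reading.

```racket
#lang racket

;;; sum-of-squares is a hypothetical name used only for illustration.
;;; It fills in the list-recursion pattern with
;;;   base-case    = 0
;;;   do-something = add the square of the car to the recursive result
(define sum-of-squares
  (lambda (lst)
    (if (null? lst)
        0
        (+ (* (car lst) (car lst))
           (sum-of-squares (cdr lst))))))

(sum-of-squares (list 1 2 3)) ; 1 + 4 + 9 = 14
```

Only the base case and the combining step changed; the shape of the procedure is exactly the pattern.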

A simple pattern: Apply a procedure to each value in a list

Let’s begin with a simple problem: Suppose we have a list of zip code entries (you remember those, don’t you?) and we just want the city names from that list. Suppose, also, that your first inclination is to use recursion to create that new list. Perhaps you’ve forgotten about map. After some reflection, you might write something like the following.

;;; Procedure:
;;;   cities
;;; Parameters:
;;;   zip-data, a list of zip code entries
;;; Purpose:
;;;   Extract the cities from zip-data
;;; Produces:
;;;   list-of-cities, a list of strings
;;; Preconditions:
;;;   Each element of zip-data is a list of the form '(zip:string
;;;   latitude:real longitude:real city:string state:string county:string).
;;; Postconditions:
;;;   The ith element of list-of-cities is the city name of the ith
;;;   element of zip-data.
(define cities
  (lambda (zip-data)
    ; If there's nothing left in the list
    (if (null? zip-data)
        ; There are no cities
        null
        ; Otherwise, grab the first city, grab the remaining cities
        ; and tie it all together.
        (cons (list-ref (car zip-data) 3)
              (cities (cdr zip-data))))))

> (define all-zips (read-csv-file "/home/rebelsky/Desktop/us-zip-codes.csv"))
> (define useful-zips (filter (lambda (entry) 
                                (and (real? (cadr entry))
                                     (real? (caddr entry))))
                              all-zips))
> (define sample-zips (filter (lambda (entry)
                                (equal? "123" (substring (car entry) 2)))
                              useful-zips))
> (length sample-zips)
58
> (take sample-zips 3)
'(("02123" 42.338947 -70.919635 "Boston" "MA" "Suffolk")
  ("04123" 44.408078 -70.470703 "Portland" "ME" "Cumberland")
  ("06123" 41.791776 -72.718832 "Hartford" "CT" "Hartford"))
> (drop sample-zips 53)
'(("94123" 37.79967 -122.435732 "San Francisco" "CA" "San Francisco")
  ("95123" 37.189396 -121.705327 "San Jose" "CA" "Santa Clara")
  ("96123" 40.776154 -120.326259 "Ravendale" "CA" "Lassen")
  ("97123" 45.458397 -122.977963 "Hillsboro" "OR" "Washington")
  ("99123" 47.913065 -119.042562 "Electric City" "WA" "Grant"))
> (define sample-cities (cities sample-zips))
> (length sample-cities)
58
> (take sample-cities 5)
'("Boston" "Portland" "Hartford" "New York" "Nassau")
> (drop sample-cities 53)
'("San Francisco" "San Jose" "Ravendale" "Hillsboro" "Electric City")

Next, suppose we decide we instead want a list of the longitude/latitude pairs for each city. Our procedure will be quite similar.

;;; Procedure:
;;;   locations
;;; Parameters:
;;;   zip-data, a list of zip code entries
;;; Purpose:
;;;   Extract the locations from zip-data
;;; Produces:
;;;   list-of-locs, a list of lists
;;; Preconditions:
;;;   Each element of zip-data is a list of the form '(zip:string
;;;   latitude:real longitude:real city:string state:string county:string).
;;; Postconditions:
;;;   * Each element of list-of-locs is a two-element list containing
;;;     a longitude and a latitude.
;;;   * For every i, the ith element of list-of-locs contains the
;;;     values from the ith element of zip-data
(define locations
  (lambda (zip-data)
    ; If there's nothing left
    (if (null? zip-data)
        ; There are no locations
        null
        ; Otherwise, grab the first longitude/latitude pair,
        ; grab the remaining pairs, and join them together.
        (cons (list (caddr (car zip-data)) (cadr (car zip-data)))
              (locations (cdr zip-data))))))

> (define sample-locations (locations sample-zips))
> (length sample-locations)
58
> (drop sample-locations 53)
'((-122.435732 37.79967)
  (-121.705327 37.189396)
  (-120.326259 40.776154)
  (-122.977963 45.458397)
  (-119.042562 47.913065))

What do these two procedures have in common? Most of the documentation: not only the six P’s, but also the internal comments. Both return null when given the empty list. More importantly, both do something to the car of the list, recurse on the cdr of the list, and then cons the two results together.

Hence, it is natural to design a general pattern for applying a procedure to every value in a list.

(define PROCNAME
  (lambda (lst)
    (if (null? lst)
        null
        (cons (TRANSFORM (car lst)) 
              (PROCNAME (cdr lst))))))

We get the first procedure by substituting something that takes element 3 from the zip code entry. We get the second by substituting something that takes elements 2 and 1 and combines them into a list.

Now, here’s the cool part. We can also turn this pattern into a procedure. We just need to make transform a parameter.

;;; Procedure:
;;;   apply-to-each
;;; Parameters:
;;;   lst, a list of values
;;;   transform, a unary procedure
;;; Purpose:
;;;   Applies transform to each value in the list.
;;; Produces:
;;;   transformed, a list of values.
;;; Preconditions:
;;;   transform takes one value as a parameter and returns a value. [Unverified]
;;; Postconditions:
;;;   (length transformed) = (length lst)
;;;   For each i, 0 <= i < (length transformed)
;;;    (list-ref transformed i) = (transform (list-ref lst i))
(define apply-to-each
  (lambda (lst transform)
    ; If no elements remain, we have nothing to transform 
    ; so stick with the empty list.
    (if (null? lst) null
        ; Otherwise, transform the first value, transform the remaining
        ; values, and put the stuff back together into a list.
        (cons (transform (car lst))
              (apply-to-each (cdr lst) transform)))))

If we plug in procedures (either previously-named procedures or newly named procedures), we get the results we expect.

> (apply-to-each sample-zips cadddr)
'("Boston" "Portland" "Hartford" "New York" ...)

Thus, as you’ve seen before in this class, you can write your own procedures that take procedures as parameters. We call procedures that take other procedures as parameters higher-order procedures.
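
The transform need not be a previously named procedure. The following sketch repeats the definition of apply-to-each so that it runs on its own and passes it both a named procedure and an anonymous one. The name three-zips is hypothetical; its entries are copied from the first three sample entries shown earlier.

```racket
#lang racket

; apply-to-each, as defined in the reading.
(define apply-to-each
  (lambda (lst transform)
    (if (null? lst)
        null
        (cons (transform (car lst))
              (apply-to-each (cdr lst) transform)))))

; three-zips is a hypothetical name; the data are the first three
; sample entries from the reading.
(define three-zips
  (list (list "02123" 42.338947 -70.919635 "Boston" "MA" "Suffolk")
        (list "04123" 44.408078 -70.470703 "Portland" "ME" "Cumberland")
        (list "06123" 41.791776 -72.718832 "Hartford" "CT" "Hartford")))

; A named procedure as the transform: extract each zip code.
(apply-to-each three-zips car)
; => '("02123" "04123" "06123")

; An anonymous procedure as the transform: build "City, ST" strings.
(apply-to-each three-zips
               (lambda (entry)
                 (string-append (cadddr entry) ", " (list-ref entry 4))))
; => '("Boston, MA" "Portland, ME" "Hartford, CT")
```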

Of course, the idea of passing a procedure as a parameter should be comfortable if you’ve been using map, reduce, sort, and similar procedures. In fact, map is likely to be defined much like apply-to-each (except that officially, the order in which map builds the result list is up to the implementer). We could even define our own version (and you may even have already done so yourself).

;;; Procedure:
;;;   proc-map
;;; Parameters:
;;;   proc, a procedure 
;;;   lst, a list 
;;; Purpose:
;;;   Applies proc to each value in a list.
;;; Produces:
;;;   newlst, a list 
;;; Preconditions:
;;;   proc can successfully be applied to each value in lst.
;;; Postconditions:
;;;   newlst is the same length as lst
;;;   For each i, 0 <= i < (length newlst)
;;;    (list-ref newlst i) = (proc (list-ref lst i))
(define proc-map
  (lambda (proc lst)
    ; If no elements remain, we can't apply anything else,
    ; so produce the empty list.
    (if (null? lst) 
        null
        ; Otherwise, apply the procedure to the first value, 
        ; apply the procedure to the remaining values, and 
        ; put the results back together into a list.
        (cons (proc (car lst)) 
              (proc-map proc (cdr lst))))))

Let us now return to the starting problems. What if we want to get the city from each element of the list?

> (proc-map cadddr sample-zips)
'("Boston" "Portland" "Hartford" "New York" ...)

What about when we want the list of longitude and latitude? We might use let to name the helper and then use proc-map with that.

> (let ([location (lambda (zip) (list (caddr zip) (cadr zip)))])
    (proc-map location sample-zips))
'((-70.919635 42.338947)
  (-70.470703 44.408078)
  (-72.718832 41.791776)
  ...)

Since we can write expressions like that, we would no longer need to write cities or locations.

That observation suggests a second important moral: Once you’ve defined a few higher-order procedures, like map, you can often avoid writing other procedures, since the higher-order procedures let you write more general expressions.
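
For instance, if we later want the states or the counties, we do not need new recursive procedures. A sketch, assuming a small hand-built list (the name three-zips is hypothetical; the entries come from the sample output above):

```racket
#lang racket

; three-zips is a hypothetical name; the data are the first three
; sample entries from the reading.
(define three-zips
  (list (list "02123" 42.338947 -70.919635 "Boston" "MA" "Suffolk")
        (list "04123" 44.408078 -70.470703 "Portland" "ME" "Cumberland")
        (list "06123" 41.791776 -72.718832 "Hartford" "CT" "Hartford")))

; The state is element 4 of each entry, so an anonymous procedure suffices.
(map (lambda (entry) (list-ref entry 4)) three-zips)
; => '("MA" "ME" "CT")

; The county is the final element, so Racket's last works as the procedure.
(map last three-zips)
; => '("Suffolk" "Cumberland" "Hartford")
```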

Higher-order predicates: Are all the elements of a list whatever?

There are many advantages to encoding design patterns in higher-order procedures. An important one is that it stops us from tediously writing the same thing over and over and over again. Think about writing the predicates all-integer?, all-irgb?, all-spot?, and so on and so forth. We’ve done so many times. However, as our colleague, John Stone, says (in reference to writing a sequence of similar procedures),

Writing and testing one of these definitions is an interesting and instructive exercise for the beginning Scheme programmer. Writing and testing another one is good practice. Writing and testing the third one is, frankly, a little tedious. If we then move on to [others], eventually programming is reduced to typing.

So, how do we avoid the repetitious typing? We begin with one of the procedures.

;;; Procedure:
;;;   all-int?
;;; Parameters:
;;;   lst, a list
;;; Purpose:
;;;   Determine if all of the values in lst are integers.
;;; Produces:
;;;   ok?, a Boolean
;;; Preconditions:
;;;   [Standard]
;;; Postconditions:
;;;   If there is an i such that (integer? (list-ref lst i))
;;;     fails to hold, then ok? is false.
;;;   Otherwise, ok? is true.
(define all-int?
  (lambda (lst)
    (or (null? lst)
        (and (integer? (car lst))
             (all-int? (cdr lst))))))

Next, we identify the parts of the procedure that depend on our current type (e.g., that everything is an integer).

(define all-WHATEVER?
  (lambda (val)
    (or (null? val)
        (and (WHATEVER? (car val))
             (all-WHATEVER? (cdr val))))))

Finally, we remove the dependent part or parts from the procedure name and make them parameters to the modified procedure.

;;; Procedure:
;;;   all
;;; Parameters:
;;;   pred?, a unary predicate
;;;   lst, a list
;;; Purpose:
;;;   Determine if pred? holds for all the values in lst.
;;; Produces:
;;;   ok?, a Boolean
;;; Preconditions:
;;;   [Standard]
;;; Postconditions:
;;;   If there is an i such that (pred? (list-ref lst i))
;;;     fails to hold, then ok? is false.
;;;   Otherwise, ok? is true.
(define all
  (lambda (pred? lst)
    (or (null? lst)
        (and (pred? (car lst))
             (all pred? (cdr lst))))))

Here’s how we might test whether something is a list of integers.

> (all integer? (list 1 2 3))
#t
> (all integer? (list 1 "two" 3))
#f

We can also define all-int? using the all procedure.

(define all-int?
  (lambda (lst)
    (all integer? lst)))

or with

(define all-int? (section all integer? <>))

The results are the same.

> (all-int? (list 1 2 3))
#t
> (all-int? (list 1 "two" 3))
#f
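
Because the predicate is now a parameter, the same procedure handles checks we never wrote by hand. The sketch below repeats the definition of all so that it runs on its own; the particular predicates and lists are our own examples.

```racket
#lang racket

; all, as defined in the reading.
(define all
  (lambda (pred? lst)
    (or (null? lst)
        (and (pred? (car lst))
             (all pred? (cdr lst))))))

(all string? (list "a" "b" "c"))                 ; => #t
(all even? (list 2 4 5))                         ; => #f
; An anonymous predicate: is every value between 0 and 100?
(all (lambda (x) (<= 0 x 100)) (list 98 75 90))  ; => #t
; With no elements to fail the test, the result is #t.
(all integer? null)                              ; => #t
```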

Built-in higher-order procedures

We have seen that it is possible to write our own higher-order procedures. Scheme also includes a number of built-in higher-order procedures. You can read about many of them in section 6.4 of the Scheme report (r5rs), which is available through the DrRacket Help Desk. Here are some of the more popular ones.

The map Procedure

You already know about the basic use of the map procedure. You also know how one might implement it. It turns out that map has some additional capabilities. For example, you can apply a procedure to multiple lists (in which case it takes the corresponding value from each list).

> (map + (list 1 2 3) (list .5 .6 .7))
(1.5 2.6 3.7)
> (define aardvark-grades (list 98 75 90 80 100))
aardvark-grades
> (define baboon-grades (list 80 82 84 86 88))
baboon-grades
> (define cheetah-grades (list 50 95 50 95 50))
cheetah-grades
> (define best-grades (map max aardvark-grades baboon-grades cheetah-grades))
best-grades
> best-grades
(98 95 90 95 100)
> (define aardvark-scaled (map (lambda (x) (* 100 x)) (map / aardvark-grades best-grades)))
aardvark-scaled
> aardvark-scaled
(100 1500/19 100 1600/19 100)
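
When map receives several lists, the procedure must accept one argument per list. As a sketch, here is a three-parameter anonymous procedure that averages the three grade lists position by position. The name average-grades is hypothetical. Note that in Racket, dividing exact integers gives exact results, so some averages appear as fractions.

```racket
#lang racket

(define aardvark-grades (list 98 75 90 80 100))
(define baboon-grades (list 80 82 84 86 88))
(define cheetah-grades (list 50 95 50 95 50))

; average-grades is a hypothetical name. map hands the lambda one value
; from each list at a time, so the lambda takes three parameters.
(define average-grades
  (map (lambda (a b c) (/ (+ a b c) 3))
       aardvark-grades baboon-grades cheetah-grades))

average-grades ; => '(76 84 224/3 87 238/3)
```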

The apply Procedure

One of the most important built-in higher-order procedures is apply, which takes a procedure and a list as arguments and invokes the procedure, giving it the elements of the list as its arguments:

> (apply string=? (list "foo" "foo"))
#t
> (apply * (list 3 4 5 6))
360
> (apply append (list (list 'a 'b 'c) (list 'd) (list 'e 'f)
                       null (list 'g 'h 'i)))
(a b c d e f g h i)

Unfortunately, since and and or are not procedures but keywords with their own evaluation rules, we can’t use them with apply.

> (map string? (list "alpha" 'beta "gamma"))
(#t #f #t)
> (and #t #f #t)
#f
> (apply and (map string? (list "alpha" 'beta "gamma")))
Error: eval: unbound variable: and 
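
One way around this limitation, sketched below, is to reuse the all procedure from earlier in this reading, either on the list of Booleans (with a predicate that returns its argument unchanged) or directly on the original list. Racket also provides a built-in andmap that plays the same role. The name checks is hypothetical.

```racket
#lang racket

; all, as defined earlier in the reading.
(define all
  (lambda (pred? lst)
    (or (null? lst)
        (and (pred? (car lst))
             (all pred? (cdr lst))))))

; checks is a hypothetical name for the list of Booleans.
(define checks (map string? (list "alpha" 'beta "gamma")))

; Option 1: "and together" a list of Booleans with an identity predicate.
(all (lambda (b) b) checks)                    ; => #f
; Option 2: skip the intermediate list and test the values directly.
(all string? (list "alpha" 'beta "gamma"))     ; => #f
; Option 3: Racket's built-in andmap.
(andmap string? (list "alpha" 'beta "gamma"))  ; => #f
```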

It might seem odd to write a call to apply with these manually constructed lists when we could instead just call the procedure with the parameters directly. If that is the case, why have apply at all? The answer is that sometimes the parameter list is not known when we write the program and must be built on the fly, or else it is just plain easier to build it on the fly.
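
As a small sketch of that situation, suppose we want the length of the longest string in a list. We cannot write the call to max by hand, because its arguments are computed when the program runs. The name longest-length is hypothetical, and the procedure assumes the list is nonempty, since max needs at least one argument.

```racket
#lang racket

;;; longest-length is a hypothetical name used for illustration.
;;; Precondition: strings is a nonempty list of strings.
(define longest-length
  (lambda (strings)
    ; The argument list for max is built on the fly by map.
    (apply max (map string-length strings))))

(longest-length (list "Boston" "Portland" "Hartford")) ; => 8
```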

As an example, consider the termial function from the reading on numeric recursion.

;;; Procedure:
;;;   termial
;;; Parameters:
;;;   number, a natural number
;;; Purpose:
;;;   Compute the sum of natural numbers not greater than a given
;;;   natural number
;;; Produces:
;;;   sum, a natural number
;;; Preconditions:
;;;   number is a number, exact, an integer, and non-negative.
;;;   The sum is not larger than the largest integer the language
;;;     permits.
;;; Postconditions:
;;;   sum is the sum of natural numbers not greater than number.
;;;   That is, sum = 0 + 1 + 2 + ... + number
(define termial
  (lambda (number)
    (if (zero? number)
        0
        (+ number (termial (- number 1))))))

As the documentation plainly tells us, the result is the sum 0 + 1 + 2 + ... + number. As it turns out, we have a concise way to generate the list of these numbers by using iota. We therefore might just as well have used apply and iota to write termial, as follows.

(define termial
  (lambda (number)
    (apply + (iota (+ 1 number)))))
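
To check that the two versions agree, we can run them side by side. This sketch gives them distinct, hypothetical names (termial-recursive and termial-apply) and assumes that iota comes from Racket's srfi/1 library; your course library may already provide it.

```racket
#lang racket
; Assumption: iota is taken from SRFI 1, since it is not part of racket/base.
(require (only-in srfi/1 iota))

; The recursive version from the reading, under a hypothetical name.
(define termial-recursive
  (lambda (number)
    (if (zero? number)
        0
        (+ number (termial-recursive (- number 1))))))

; The apply-based version from the reading, under a hypothetical name.
(define termial-apply
  (lambda (number)
    (apply + (iota (+ 1 number)))))

(iota 5)               ; => '(0 1 2 3 4)
(termial-recursive 10) ; => 55
(termial-apply 10)     ; => 55
(termial-apply 0)      ; => 0, since (iota 1) is '(0)
```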

Returning procedures

Just as it is possible for procedures to take procedures as their parameters, it is also possible for procedures to produce new procedures as their return values. For example, here is a procedure that takes one parameter, a number, and creates a procedure that multiplies its parameter by that number.

;;; Procedure:
;;;   make-multiplier
;;; Parameters:
;;;   n, a number
;;; Purpose:
;;;   Creates a new procedure which multiplies its parameter by n.
;;; Produces:
;;;   multiplier, a procedure of one parameter
;;; Preconditions:
;;;   n must be a number
;;; Postconditions:
;;;   (multiplier v) = n * v
(define make-multiplier
  (lambda (n) ; n is the parameter to make-multiplier
    ; Return value: A procedure
    (lambda (v) 
      (* n v))))

Let’s test it out.

> (make-multiplier 5)
#<procedure>
> (define timesfive (make-multiplier 5))
> (timesfive 4)
20
> (timesfive 101)
505
> (map (make-multiplier 3) (list 1 2 3))
(3 6 9)
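
Each call to make-multiplier builds a separate procedure that remembers its own n. The following sketch, with the hypothetical names double and triple, shows two multipliers coexisting without interfering with each other.

```racket
#lang racket

; make-multiplier, as defined in the reading.
(define make-multiplier
  (lambda (n)
    (lambda (v)
      (* n v))))

; double and triple are hypothetical names; each keeps its own n.
(define double (make-multiplier 2))
(define triple (make-multiplier 3))

(double 7) ; => 14
(triple 7) ; => 21
; The multiplier can also be built right where it is used.
(map (make-multiplier 1/2) (list 2 4 6)) ; => '(1 2 3)
```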

We can use the same technique to build the legendary compose operation that, given two functions, f and g, builds a function that applies g and then f.

;;; Procedure:
;;;   compose
;;; Parameters:
;;;   f, a unary function
;;;   g, a unary function
;;; Purpose:
;;;   Functionally compose f and g.
;;; Produces:
;;;   fun, a unary function.
;;; Preconditions:
;;;   f can be applied to any values g generates.
;;; Postconditions:
;;;   fun can be applied to any values g can be applied to.
;;;   fun generates values of the type that f generates.
;;;   (fun x) = (f (g x))
(define compose
  (lambda (f g) ; f and g are the parameters to compose
    ; compose returns a procedure of one parameter
    (lambda (x)
      ; that procedure applies g, and then applies f.
      (f (g x)))))

Here are some tests of that procedure.

> (define add2 (lambda (x) (+ 2 x)))
> (define mul5 (lambda (x) (* 5 x)))
> (define fun1 (compose add2 mul5))
> (fun1 5)
27
> (fun1 3)
17
> (define fun2 (compose mul5 add2))
> (fun2 5)
35
> (fun2 3)
25
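
The two procedures need not work on the same type, as long as f accepts whatever g produces. In this sketch, g turns a symbol into a string and f measures the string. The name symbol-length is hypothetical, and the definition of compose below shadows Racket's built-in procedure of the same name within this module.

```racket
#lang racket

; compose, as defined in the reading (shadows Racket's built-in compose here).
(define compose
  (lambda (f g)
    (lambda (x)
      (f (g x)))))

; symbol-length is a hypothetical name: first symbol->string, then string-length.
(define symbol-length (compose string-length symbol->string))

(symbol-length 'alpha)                     ; => 5
(map symbol-length (list 'a 'beta 'gamma)) ; => '(1 4 5)
```

Composing in the other order would violate the precondition, since string-length produces a number and symbol->string expects a symbol.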

Especially when using map, we often write anonymous procedures that look something like the following.

  (lambda (num) (* 2 num))

Even make-multiplier is actually something we might want to generalize. You’ll note that in that procedure, we filled in one parameter (n) of a two-parameter procedure (*). In pattern form, we might write

  (lambda (val) (BINARY-PROC ARG1 val))

Let’s think about how we might turn that into procedures. (left-section binary-proc arg1) creates a new procedure by filling in the first argument of a binary procedure. (right-section binary-proc arg2) creates a new procedure by filling in the second argument of a binary procedure. We often abbreviate these two procedures as l-s and r-s.

In the following example, we define procedures that multiply their parameter by 2 or subtract 3 from their parameter, or some combination thereof.

> (define mul2 (left-section * 2))
mul2
> (define sub3 (right-section - 3))
sub3
> (map mul2 (list 1 2 3 4 5))
(2 4 6 8 10)
> (map sub3 (list 1 2 3 4 5))
(-2 -1 0 1 2)
> (map (compose mul2 sub3) (list 1 2 3 4 5))
(-4 -2 0 2 4)
> (map (compose sub3 mul2) (list 1 2 3 4 5))
(-1 1 3 5 7)
> (map (compose (l-s * 2) (r-s - 3)) (list 1 2 3 4 5))
(-4 -2 0 2 4)
> (map (compose (l-s * 2) (l-s - 3)) (list 1 2 3 4 5))
(4 2 0 -2 -4)

Okay, what does left-section look like? The definition is fairly straightforward.

;;; Procedures:
;;;   left-section 
;;;   l-s
;;; Parameters:
;;;   binproc, a two-parameter procedure
;;;   left, a value
;;; Purpose:
;;;   Creates a one-parameter procedure by filling in the first parameter
;;;     of binproc.
;;; Produces:
;;;   unproc, a one-parameter procedure 
;;; Preconditions:  
;;;   left is a valid first parameter for binproc.
;;; Postconditions:
;;;   (unproc right) = (binproc left right)
(define left-section
  (lambda (binproc left)
    ; Build a new procedure of one argument
    (lambda (right)
      ; That calls binproc with left filled in as the first argument
      (binproc left right))))
(define l-s left-section)
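
The order of the arguments matters when the binary procedure is not commutative. The sketch below follows the reading's definition of left-section and tries it with subtraction, exponentiation, and string-append; the particular examples are our own.

```racket
#lang racket

; left-section, following the reading's definition.
(define left-section
  (lambda (binproc left)
    (lambda (right)
      (binproc left right))))
(define l-s left-section)

; The filled-in value becomes the FIRST argument.
((l-s - 10) 3)                     ; => 7, that is, 10 - 3
(map (l-s expt 2) (list 0 1 2 3))  ; => '(1 2 4 8), powers of two
(map (l-s string-append "zip-") (list "02123" "04123"))
; => '("zip-02123" "zip-04123")
```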

How is right-section defined? We leave that as an exercise for the reader.

Experienced Scheme programmers regularly use left-section and right-section, not only in calls to map and other higher-order procedures, but also in defining new procedures. For example, consider the all-int? procedure that we previously defined as

(define all-int?
  (lambda (lst)
    (all integer? lst)))

Here is an even more concise definition that takes advantage of l-s.

(define all-int? (l-s all integer?))
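
To see that the sectioned definition behaves like the lambda version, here is a self-contained sketch that repeats the definitions of all and left-section. The extra predicate all-string? is a hypothetical name, added to show that the same one-line technique works for other types.

```racket
#lang racket

; all and left-section, following the reading's definitions.
(define all
  (lambda (pred? lst)
    (or (null? lst)
        (and (pred? (car lst))
             (all pred? (cdr lst))))))
(define left-section
  (lambda (binproc left)
    (lambda (right)
      (binproc left right))))
(define l-s left-section)

; Filling in the first argument of all gives a one-parameter predicate.
(define all-int? (l-s all integer?))
(all-int? (list 1 2 3))     ; => #t
(all-int? (list 1 "two" 3)) ; => #f

; all-string? is a hypothetical name built the same way.
(define all-string? (l-s all string?))
(all-string? (list "a" "b")) ; => #t
```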

Self checks

Check 1: Reviewing definitions

a. What is the definition of a higher-order procedure?

b. Despite its length, this reading only introduced one new built-in Scheme higher-order procedure. What is it?

c. Why is apply considered a higher-order procedure?

Check 2: Reflecting on apply

a. The reading gives two versions of termial. Which do you prefer? Why?

b. Use a strategy similar to the updated termial to rewrite the procedure sum, which sums a list of numbers, using apply, rather than recursion.

c. If you completed the previous check, you might now notice that you’ve written a procedure (sum) that merely calls another procedure (apply) with its first parameter fixed, a pattern that suggests we use sectioning for further abbreviation. See if you can rewrite sum once again so that it uses l-s (left-section) rather than lambda to define the procedure.

Hint: Here is another example of using this pattern to abbreviate a procedure definition.

(define double
  (lambda (number)
    (* 2 number)))

(define double (l-s * 2))