%CustomEntities; %CourseEntities; %CommonEntities; ]>
Keeping Track of Tasks with Restricted Access Collections Summary: We consider a few related data structures that are frequently used to collect values and provide value-by-value access to the collection. Together, these data structures are sometimes called restricted access collections. Individually, they are called queues, stacks, and priority queues. We also consider a common application of these data structures: keep track of tasks left to do.
Introduction: Deep Recursion, Revisited As you may recall, one of the topics we've been considering recently deep recursion, a type of recursion in which each procedure call may make multiple recursive calls to itself. (In contrast, traditional linear recursion, each procedure call makes at most one direct recursive call.) Deep recursion is particularly useful for processing trees, since we can recurse on both the left and right subtrees. For example, we might compute the size of a tree with the following procedure: In this case, there are either zero or two recursive calls per node. We've also used deep recursion to generate fractals, as in the fractal-rectangle procedure. A simplified version of that procedure follows. In this case, there are either zero or four recursive calls per procedure call. One of the issues associated with deep-recursive procedures, like fractal-rectangle, is that there are lot of hidden delayed procedure calls built up somewhere. For example, while we're in the middle of drawing the upper-left corner of a rectangle, we have some recursive calls built up for the rest of that smaller rectangle and some recursive calls built up for the rest of the rest of the bigger rectangle (and, if we have multiple levels of fractal recursion, even more recursive calls built up for intermediate rectangles). However, programmers do not have access to these recursive calls - once they've been set up, they will happen at some time determined by the computer (and by the structure of the program). Because the programmer does not have access to those delayed calls, it is not possible to modify them (say, to stop drawing midway through or to de-prioritize some activities) or even to check what is happening. As an alternate, some programmers make the remaining things to do an explicit parameter to the procedure, rather than implicit from the call structure of the procedure. In effect, the programmer needs a data structure to keep track of the tasks left do. Instead of making lots of recursive calls, a procedure will simply add tasks to the collection. When done with one call, the program will grab the next task and continue. We're done when there are no tasks left. Collections that provide the operations of add an element, get the next element, and remove that element are called restricted access collections. Restricted access means that the collection allows access to only one element at any given time. Interestingly, the policy by which we choose that one element (either to use or remove) can have a significant effect on the behavior of the procedure. We'll explore that behavior a bit in this reading and the corresponding lab.
Primary Procedures In many ways, these restricted access collections, used to collect tasks, are like the to-do lists that many of us maintain: We add tasks, we remove tasks as we do them, we check if we have tasks left. The big difference is that we'll be a little more structured in the access we provide to tasks. Let's first describe what we want the procedures to do and then come back to the implementation a bit later. One of our first decisions in thinking about collections of tasks is whether they should be mutable (as is the case with vectors) or whether we build new collections from old, by adding or removing elements (as is the case with lists). The common custom in Scheme programming is to avoid mutation as much as possible, so we will make collections non-mutable. Of course, that decision has some implications for how we remove elements. On paper, we might simultaneously find and remove the next task. In code, it's useful to separate those activities. Why? Well, consider what happens if we grab and remove a task. In effect, we'll need that procedure to return two values: the next task and the modified collection of tasks. Although it is possible to have a procedure return two values, it is simpler to separate the activities. Given those decision, there are five basic operations we can use for manipulating collections of tasks: create a new collection of tasks, add an element to the collection, get the next element to process, remove that element, and check whether or not the collection contains any more tasks to process. Since these collections have arbitrary size, the procedure to create new collections needs no parameters. To get the next element from the collection, we need only the collection. Note that different collections may have different policies for getting a value, so we do not specify the policy in this documentation. Why might different collections have different policies? Well, consider a typical task list. Some people do tasks in the same order in which they are added. Others associate priorities with their tasks. Others choose tasks randomly. Still others have more complex policies. We'll revisit those policies in a subsequent section. For now, we simply leave the particular decision of what element to get unspecified. We'll leave the same issue unspecified in collection.drop. Since some people prioritize their tasks, we'll include a priority in the collection.add procedure. In many cases, we'll ignore that priority. However, it's easiest to include it for the few cases in which we need it. Finally, we'll need a way to make sure that there are tasks left to process.
Using Restricted Access Collections Let's consider how we could use these techniques to make the collected recursive calls in simple-fractal-rectangle! explicit, rather than implicit. We'll explicitly keep track of a collection of the sub-rectangles left to draw. We can represent each sub-rectangle left to draw. draw as a six-element vector (color, left, top, right, bottom, level). Now, what do we do with that collection? We repeatedly grab the first rectangle and process it as appropriate (sometimes adding more rectangles to draw). For clarity, we'll use separate processes to do the repetition (kernel) and the processing of individual rectangles (process-rect). The kernel is relatively straightforward. (letrec ((kernel (lambda (tasks) (if (collection.empty? tasks) 'Success (kernel (process-rect (collection.get tasks) (collection.drop tasks))))))) ...) What should process-rect look like? For the base case (level is 0), we just draw the rectangle. For the recursive case, we need to build four tasks (well, descriptions of rectangles) and add each to the collection. (let ((midcol (round (* 0.5 (+ right left)))) (midrow (round (* 0.5 (+ bottom top)))) (nextlevel (- level 1))) (let* ((rect1 (vector (rgb.lighter color) left top midcol midrow (- level 1))) (rect2 (vector (rgb.darker color) midcol top right midrow (- level 1))) (rect3 (vector (rgb.darker color) left midrow midcol bottom (- level 1))) (rect4 (vector color midcol midrow right bottom (- level 1))) (c1 (collection.add remaining-tasks rect1 0)) (c2 (collection.add c1 rect2 0)) (c3 (collection.add c2 rect3 0)) (c4 (collection.add c3 rect4 0))) All that's left is to decide what rectangle the collection should start with. That's easy enough: We start with a collection that contains the whole rectangle. Putting it all together, we get You'll explore this version of the procedure in the lab.
Policies for Getting Values As we mentioned earlier, there are many different policies that can be used for selecting the next value to process. One of the most common policies is first-in, first-out (FIFO). In this policy, the value returned by collection.get is the first of the remaining values. That is, if we add A to the collection before B, we process A before B. Restricted access collections that implement a FIFO policy are typically called queues. Another common strategy is to assign a numeric priority to each value, and to remove values according to their priority. (Lower numbers represent higher priorities.) Restricted access collections that implement a priority policy are typically called priority queues. You'll note that neither of these is the strategy that Scheme seems to use when processing procedures. That is, if you've set up six recursive calls, and the first of those recursive calls does four more recursive calls, those newer calls are done before the other calls, rather than after (as in FIFO). There are also strategies that humans use that don't match the other two. For example, some people always put the mail they've just received on the top of their inbox, and process their inbox from top to bottom. In both cases, the strategy seems to be much more one of last in, first out (LIFO). Restricted access collections that implement a LIFO policy are typically called stacks. Queues, priority queues, and stacks are the three most common restricted access collections. We implement each in different ways.
Implementing Stacks We'll consider only stacks, and only one implementation of stacks. When implementing these kinds of collections, we might use lists, trees, or vectors. Since we're building non-mutable structures, and accessing the values in those structures one at a time, it seems that lists are the easiest possible underlying structure. A new collection is simply the empty list. To add something to the collection, we cons it onto the front (ignoring the priority). The next thing to process is therefore the car. Similarly, to delete something, we just take the cdr. And empty lists represent empty collections. Here's the code that puts all of those ideas together.
Selecting an Implementation The implementations of queues and priority queues is a bit more complex, so we will leave them until later. Suppose we've implemented stacks, queues, and priority queues. How do we decide which our program uses? If a procedure requires a particular kind of restricted access collection, we use those names (e.g., stack.add) rather than the generic names (e.g., collection.add). If we want to use the generic names, so that we can experiment with the strengths and weaknesses of various implementations, we can simply tell Scheme to treat things as equivalent. For example, if we want to use the stack implementation, we might write (define collection.new stack.new) (define collection.get stack.get) (define collection.drop stack.drop) (define collection.add stack.add) (define collection.empty? stack.empty?) In the lab, you'll use this technique to explore each kind of restricted access collection.