Suppose we are given an [unordered] array a of size
MAX, a number value, and an index i.
We want to compute three values:
rank(value, a) is the number of elements of a
that are less than or equal to value.
trank(value, a), or the truncated rank, is the number of
elements of a that are less than value.
indexedCount(value, i, a) is the number of elements in
a[0..i-1] that equal value.
For example, suppose a = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]. Then
rank(5, a) is 8.
trank(5, a) is 6.
indexedCount(5, 4, a) is 0, and both indexedCount(5, 5,
a) and indexedCount(5, 7, a) are 1.
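The three definitions can be checked with a short single-process sketch in C. The explicit length parameter n is an assumption added here for illustration (the lab itself uses the constant MAX), and the function names simply follow the definitions above.

```c
#include <stddef.h>

/* Number of elements of a[0..n-1] that are less than or equal to value. */
int rank(int value, const int a[], size_t n)
{
    int count = 0;
    for (size_t k = 0; k < n; k++)
        if (a[k] <= value)
            count++;
    return count;
}

/* Truncated rank: number of elements of a[0..n-1] that are less than value. */
int trank(int value, const int a[], size_t n)
{
    int count = 0;
    for (size_t k = 0; k < n; k++)
        if (a[k] < value)
            count++;
    return count;
}

/* Number of elements of a[0..i-1] that equal value.
   If i exceeds n, only a[0..n-1] is examined. */
int indexedCount(int value, size_t i, const int a[], size_t n)
{
    int count = 0;
    size_t limit = (i < n) ? i : n;
    for (size_t k = 0; k < limit; k++)
        if (a[k] == value)
            count++;
    return count;
}
```

Calling these functions on the ten-element example array with value 5 reproduces the results listed above.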
This lab seeks to compute the rank, truncated rank, and indexed count using multiple processes that are organized in a hierarchy.
While a single-process approach likely would be much simpler (and faster) for these specific tasks, much of the purpose of the lab is to illustrate a common strategy within parallel algorithms.
Suppose we have 2^N - 1 processors available to us. We arrange
them within a full tree structure, as illustrated below (where
N is 5).
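Notes:
If there are 2^N - 1 processors altogether, then the
tree has N levels, and the last row has
2^(N-1) processors.
The first processor in the last row is numbered 2^(N-1) - 1.
Processor Pi is not in the last row if i <= 2^(N-1) - 2,
and processor Pi is in the last row if i >= 2^(N-1) - 1.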
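The index arithmetic in these notes can be written as a few small C helpers. This is a sketch under an assumption: processors are numbered from 0 at the root, level by level and left to right, as in an array-based binary heap. Under that assumption the children of processor i are 2i + 1 and 2i + 2. The helper names are hypothetical.

```c
/* Assumed numbering: 0 at the root, then level by level, left to right. */

/* Total number of processors in a full tree with n levels: 2^n - 1. */
int processor_count(int n)
{
    return (1 << n) - 1;
}

/* Number of the first processor in the last row: 2^(n-1) - 1. */
int first_leaf(int n)
{
    return (1 << (n - 1)) - 1;
}

/* Nonzero if processor i is in the last row of a tree with n levels. */
int is_leaf(int i, int n)
{
    return i >= first_leaf(n);
}

/* Parent of processor i, or -1 for the root (which has no parent). */
int parent_of(int i)
{
    return (i == 0) ? -1 : (i - 1) / 2;
}

/* Children of processor i (meaningful only when i is not in the last row). */
int left_child(int i)
{
    return 2 * i + 1;
}

int right_child(int i)
{
    return 2 * i + 2;
}
```

With N = 5, these helpers give 31 processors, with the last row occupying numbers 15 through 30.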
For simplicity, suppose MAX is R*2^(N-1)
for known integers R and N.
We want to compute the rank, truncated rank, and indexed count for a
given value and index i.
Part A: We assign the bottom row of processors in the above tree to compute the
desired values for blocks of the array. The first processor will examine
a[0..R-1], the second processor will examine
a[R..2R-1], the third processor will examine
a[2R..3R-1], and so forth. Thus, the jth processor
in the last row will examine a[R*(j-1)..R*j-1].
Within this array segment, the jth processor of the last row computes rank, truncated rank, and indexed count using a simple linear search.
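A minimal sketch of the work done by one last-row processor follows. The struct result type and the function name leaf_search are hypothetical. The sketch also assumes that a segment's indexed count includes only positions k < i, so that the counts from all segments add up to indexedCount(value, i, a).

```c
/* The three values computed for one array segment. */
struct result {
    int rank;           /* elements <= value */
    int trank;          /* elements <  value */
    int indexed_count;  /* elements == value at positions below i */
};

/* Linear search by the jth last-row processor (j counts from 1),
   which examines a[r*(j-1) .. r*j-1]. */
struct result leaf_search(const int a[], int r, int j, int value, int i)
{
    struct result res = {0, 0, 0};
    int lo = r * (j - 1);   /* first index of the segment */
    int hi = r * j;         /* one past the last index of the segment */

    for (int k = lo; k < hi; k++) {
        if (a[k] <= value)
            res.rank++;
        if (a[k] < value)
            res.trank++;
        if (k < i && a[k] == value)
            res.indexed_count++;
    }
    return res;
}
```

Because the segments are disjoint and together cover the whole array, each element is counted by exactly one last-row processor.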
From part A, the processes at the bottom of the processor tree know the desired results for various array segments. These values can percolate up through the tree as follows:
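Each processor (not in the bottom row) receives the rank, truncated rank, and indexed count from each of its two children, adds the corresponding values, and passes the sums up the tree.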
At the top of the hierarchy, process 0 prints the final results on the screen.
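The percolation can be simulated in a single process before any sockets are introduced. The sketch below assumes that an interior processor combines its children's results by adding corresponding values, which is natural because the array segments are disjoint. It also assumes heap-style numbering, with the children of processor p at 2p + 1 and 2p + 2. The names struct result, combine, and percolate_up are hypothetical.

```c
/* The three values known for one subtree of processors. */
struct result {
    int rank;
    int trank;
    int indexed_count;
};

/* Combine the results of two disjoint array segments by addition. */
struct result combine(struct result left, struct result right)
{
    struct result sum;
    sum.rank = left.rank + right.rank;
    sum.trank = left.trank + right.trank;
    sum.indexed_count = left.indexed_count + right.indexed_count;
    return sum;
}

/* results[] has 2^n_levels - 1 entries, and the last-row entries
   (numbers 2^(n_levels-1) - 1 and above) are already filled in.
   Fill in every interior entry, working upward toward entry 0. */
void percolate_up(struct result results[], int n_levels)
{
    int first_leaf = (1 << (n_levels - 1)) - 1;

    for (int p = first_leaf - 1; p >= 0; p--)
        results[p] = combine(results[2 * p + 1], results[2 * p + 2]);
}
```

After percolate_up returns, results[0] holds the totals that process 0 would print. In the socket version, each loop iteration corresponds to one interior process reading from its two children.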
For simplicity, suppose that MAX, R, and N are
given as program constants. (If you wish, 2^(N-1)
and 2^N - 1 also may be defined as program constants.)
Also, for simplicity, suppose that the array a[0..MAX-1]
is stored in shared memory.
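One possible way to write these constants in C is shown below. The values N = 3 and R = 4 are chosen only for illustration, and the names LAST_ROW and NUM_PROCS are hypothetical. The array is declared as an ordinary global here; placing it in shared memory is a separate step that this sketch does not show.

```c
enum {
    N = 3,                      /* number of levels in the tree (example value) */
    R = 4,                      /* elements per last-row processor (example value) */
    LAST_ROW = 1 << (N - 1),    /* 2^(N-1): processors in the last row */
    NUM_PROCS = (1 << N) - 1,   /* 2^N - 1: processors altogether */
    MAX = R * LAST_ROW          /* size of the array a */
};

int a[MAX];                     /* the array a[0..MAX-1] */
```

Using an enum rather than #define keeps the derived values as integer constant expressions, so MAX can be used directly as an array size.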
Conceptually, connections between processors would be made via pipes or sockets, although sockets are to be used in this laboratory exercise. (In a more practical setting, the sockets would allow communication among multiple machines, but work on multiple machines is not part of this exercise.)
Number the sockets from 0 through 2^N - 1,
starting at the top of the tree and moving downward from left to right.
Use the numbering scheme for processor children and parents to determine the number of the pipe or socket going to a processor's parent and to determine the number of the pipe or socket coming in from each child.
Code the above algorithm in C, using sockets.
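As a starting point, the following sketch shows one parent process collecting results from two child processes over socket pairs, which is the smallest possible tree (two levels). It is not the full lab solution. It uses the POSIX calls socketpair, fork, read, write, and waitpid; all function and type names are hypothetical. Each child sees its own copy of the array through fork, so no real shared memory is set up here.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/wait.h>
#include <unistd.h>

struct result { int rank; int trank; int indexed_count; };
struct child { pid_t pid; int fd; };   /* a child and the socket it reports on */

/* Linear search over a[lo..hi-1]; only positions below i count as matches. */
static struct result search_segment(const int a[], int lo, int hi, int value, int i)
{
    struct result res = {0, 0, 0};
    for (int k = lo; k < hi; k++) {
        if (a[k] <= value) res.rank++;
        if (a[k] < value) res.trank++;
        if (k < i && a[k] == value) res.indexed_count++;
    }
    return res;
}

/* Read exactly len bytes; return 0 on success, -1 on error or early EOF. */
static int read_all(int fd, void *buf, size_t len)
{
    char *p = buf;
    while (len > 0) {
        ssize_t got = read(fd, p, len);
        if (got <= 0) return -1;
        p += got;
        len -= (size_t)got;
    }
    return 0;
}

/* Fork a child that searches a[lo..hi-1] and writes its result to a socket. */
static int spawn_search(const int a[], int lo, int hi, int value, int i, struct child *c)
{
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) return -1;
    pid_t pid = fork();
    if (pid < 0) { close(sv[0]); close(sv[1]); return -1; }
    if (pid == 0) {                       /* child: compute, report, exit */
        close(sv[0]);
        struct result res = search_segment(a, lo, hi, value, i);
        ssize_t sent = write(sv[1], &res, sizeof res);
        close(sv[1]);
        _exit(sent == (ssize_t)sizeof res ? 0 : 1);
    }
    close(sv[1]);                         /* parent keeps only its own end */
    c->pid = pid;
    c->fd = sv[0];
    return 0;
}

/* Read one child's result, then reap the child. */
static int collect(struct child *c, struct result *out)
{
    int status = read_all(c->fd, out, sizeof *out);
    close(c->fd);
    waitpid(c->pid, NULL, 0);
    return status;
}

/* Two-level tree: two children each search half of a[0..n-1];
   the parent adds their results. Returns 0 on success, -1 on failure. */
int two_child_search(const int a[], int n, int value, int i, struct result *total)
{
    struct child kids[2];
    struct result part[2];
    if (spawn_search(a, 0, n / 2, value, i, &kids[0]) < 0) return -1;
    if (spawn_search(a, n / 2, n, value, i, &kids[1]) < 0) {
        collect(&kids[0], &part[0]);
        return -1;
    }
    int failed = 0;
    for (int k = 0; k < 2; k++)
        if (collect(&kids[k], &part[k]) < 0) failed = 1;
    if (failed) return -1;
    total->rank = part[0].rank + part[1].rank;
    total->trank = part[0].trank + part[1].trank;
    total->indexed_count = part[0].indexed_count + part[1].indexed_count;
    return 0;
}
```

Both children are started before either result is read, so the two searches can run concurrently. Extending the idea to N levels means that each interior process repeats the parent's role for its own two children and then writes the combined result to the socket leading to its parent.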
This document is available on the World Wide Web as
http://www.cs.grinnell.edu/~walker/courses/213.fa04/lab-par-search.shtml
created November 6, 2004; last revised November 14, 2004
For more information, please contact Henry M. Walker at walker@cs.grinnell.edu.