Goals: This laboratory exercise provides practice with some simple parallel algorithms related to prefix sums, as discussed in class. A later laboratory exercise will consider more general parallel algorithms.
Process: You must work in groups of 2 or 3 for this lab, where the groups are chosen by the instructor. Individual work is not appropriate, since part of the point of the lab is to encourage group discussion.
Exercises for this Lab: Exercises 1-3 are edited versions of exercises from Chapter 2 of An Introduction to Parallel Algorithms, by Joseph JáJá.
For this lab you are to do exercise 1, either exercise 2 or 3 (your choice), and exercise 4.
Exercise 1: Consider the following divide-and-conquer algorithm for computing the prefix sums {s_i} of a sequence x_1, x_2, ..., x_n, where n is a power of 2.
Compute the prefix sums of the two subsequences {x_1, x_2, ..., x_{n/2}} and {x_{(n/2)+1}, x_{(n/2)+2}, ..., x_n}; call the results {z_1, z_2, ..., z_{n/2}} and {z_{(n/2)+1}, z_{(n/2)+2}, ..., z_n}, respectively.
Then, set s_i = z_i for 1 ≤ i ≤ n/2 and s_i = z_i + z_{n/2} for (n/2) + 1 ≤ i ≤ n.
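To make the description concrete, the recursive scheme can be sketched sequentially in Python. This is only an illustration of the recursion as stated, not the non-recursive PRAM version the exercise asks for; the function name prefix_sums and the use of integer addition are assumptions made for the example.

```python
def prefix_sums(x):
    """Prefix sums of x via the divide-and-conquer scheme.

    Assumes len(x) is a power of 2 (hypothetical helper, for illustration).
    """
    n = len(x)
    if n == 1:
        return [x[0]]
    half = n // 2
    # The two recursive calls are independent of each other,
    # so a parallel machine could evaluate them at the same time.
    z_left = prefix_sums(x[:half])    # z_1 ... z_{n/2}
    z_right = prefix_sums(x[half:])   # z_{(n/2)+1} ... z_n
    total_left = z_left[-1]           # z_{n/2}, the sum of the first half
    # s_i = z_i for the first half; s_i = z_i + z_{n/2} for the second half.
    return z_left + [z + total_left for z in z_right]


print(prefix_sums([1, 2, 3, 4, 5, 6, 7, 8]))  # [1, 3, 6, 10, 15, 21, 28, 36]
```

In the combine step, every element of the second half needs the single value z_{n/2}, which is worth keeping in mind when answering the question about the PRAM model.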
Write a non-recursive version of this algorithm following the same general syntax used in class.
What are the time and the work required by the algorithm?
What is the PRAM model (e.g., EREW, CREW, CRCW) needed?
Exercise 2: Let A = (a_1, a_2, ..., a_n) be an array of elements, and let j_1 = 1 < j_2 < ... < j_s = n be a set of indices. Consider the problem of computing the array B = (b_1, b_2, ..., b_n) such that b_l = a_{j_i} for j_{i-1} < l ≤ j_i and 2 ≤ i ≤ s (with b_1 = a_1, as the example shows).
For example, the array B corresponding to A = (4, 7, -1, 9, 6, 15) and j_1 = 1 < j_2 = 4 < j_3 = 6 is given by B = (4, 9, 9, 9, 15, 15).
Develop a parallel algorithm for this problem whose running time is O(log n).
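For checking a parallel solution against small inputs, a sequential reference version of the specification may be handy. The sketch below is an assumption-laden illustration: the name fill_from_indices is hypothetical, the indices are passed as a 1-based Python list, and b_1 is taken to be a_1 as in the example. It runs in O(n) sequential time and is not the O(log n) parallel algorithm requested.

```python
def fill_from_indices(a, js):
    """Sequential reference for the array B.

    a  : list holding a_1 ... a_n (stored 0-based in Python)
    js : increasing 1-based indices with js[0] == 1 and js[-1] == len(a)
    """
    n = len(a)
    b = [None] * n
    b[0] = a[0]  # b_1 = a_{j_1} = a_1 (assumed, matching the example)
    for prev, cur in zip(js, js[1:]):
        # Every position l with prev < l <= cur (1-based) receives a_cur.
        for l in range(prev + 1, cur + 1):
            b[l - 1] = a[cur - 1]
    return b


print(fill_from_indices([4, 7, -1, 9, 6, 15], [1, 4, 6]))  # [4, 9, 9, 9, 15, 15]
```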
Exercise 3 (Segmented Prefix Sums): We are given a sequence A = (a_1, a_2, ..., a_n) of elements from a set S, and a Boolean array B of length n, such that b_1 = b_n = 1.
For each i_1 < i_2 such that b_{i_1} = b_{i_2} = 1 and b_j = 0 for all i_1 < j < i_2, we wish to compute the prefix sums of the subarray (a_{i_1+1}, ..., a_{i_2}) of A.
Develop an O(log n) time algorithm to compute all the corresponding prefix sums. Your algorithm should use O(n) operations and should run on the EREW PRAM.
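A small sequential reference can help pin down what output is expected before designing the EREW algorithm. The sketch below is one reading of the problem statement, with several assumptions: the name segmented_prefix_sums is hypothetical, the elements are integers under ordinary addition, and the result is returned as one list of prefix sums per segment.

```python
def segmented_prefix_sums(a, b):
    """Sequential reference: prefix sums of each segment (a_{i1+1}, ..., a_{i2}).

    a : list of numbers a_1 ... a_n (stored 0-based in Python)
    b : list of 0/1 flags of the same length, with b[0] == b[-1] == 1
    """
    n = len(a)
    assert len(b) == n and b[0] == 1 and b[-1] == 1
    ones = [i for i in range(n) if b[i] == 1]  # 0-based positions of the 1 flags
    result = []
    for i1, i2 in zip(ones, ones[1:]):
        running = 0
        sums = []
        # The segment starts just after one flag and ends at the next flag.
        for k in range(i1 + 1, i2 + 1):
            running += a[k]
            sums.append(running)
        result.append(sums)
    return result


# Flags at (1-based) positions 1, 3, 6 give the segments (a_2, a_3) and (a_4, a_5, a_6).
print(segmented_prefix_sums([3, 1, 4, 1, 5, 9], [1, 0, 1, 0, 0, 1]))  # [[1, 5], [1, 6, 15]]
```

Under this reading, a_1 belongs to no segment, since the first segment begins at position i_1 + 1 = 2.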
Exercise 4: Laboratory Exercise 10 implemented parallel algorithms to compute rank(value, a), trank(value, a), and indexedCount(value, i, a). Each of these algorithms runs in O(log n) time. Of course, with a sufficient number of processors, one could (in principle) perform as many evaluations of rank, trank, and indexedCount as desired, all within O(log n) time.
Use rank, trank, and indexedCount as needed to place the elements of an array a into a sorted array b with just one additional parallel step.
Hint: Consider the location where each element a[i] should end up in b, based on rank, trank, and/or indexedCount.
a to array b for Exercise 4.
This document is available on the World Wide Web as
http://www.cs.grinnell.edu/~walker/courses/213.fa04/lab-prefix.shtml
created November 22, 2004; last revised November 30, 2004
For more information, please contact Henry M. Walker at walker@cs.grinnell.edu.