CSC 213: Operating Systems and Parallel Algorithms

Laboratory Exercise on Parallel Prefix-Sum Algorithms

Goals: This laboratory exercise provides practice with some simple parallel algorithms related to prefix sums, as discussed in class. A later laboratory exercise will consider more general parallel algorithms.

Process: You must work in groups of 2 or 3 for this lab, where the groups are chosen by the instructor. Individual work is not appropriate, since part of the point of the lab is to encourage group discussion.

Exercises for this Lab: Exercises 1-3 are edited versions of exercises from Chapter 2 of An Introduction to Parallel Algorithms, by Joseph JáJá.

For this lab you are to do exercise 1, either exercise 2 or 3 (your choice), and exercise 4.

  1. Consider the following divide-and-conquer algorithm for computing the prefix sums {s_i} of a sequence x_1, x_2, ..., x_n, where n is a power of 2. Compute the prefix sums of the two subsequences {x_1, x_2, ..., x_(n/2)} and {x_(n/2+1), x_(n/2+2), ..., x_n} -- say, {z_1, z_2, ..., z_(n/2)} and {z_(n/2+1), z_(n/2+2), ..., z_n}. Then, set s_i = z_i for 1 ≤ i ≤ n/2 and s_i = z_i + z_(n/2) for (n/2) + 1 ≤ i ≤ n.

    1. Write a non-recursive version of this algorithm following the same general syntax used in class.

    2. What are the time and the work required by the algorithm?

    3. Which PRAM model (e.g., EREW, CREW, CRCW) does the algorithm need?
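
    To make the description in exercise 1 concrete, the following is a minimal sequential sketch of the recursive scheme in Python. It is only an illustration of the divide-and-conquer idea, not the non-recursive parallel version that part 1 asks for; the function name prefix_sums and the sample input are assumptions made for this example.

```python
def prefix_sums(x):
    """Prefix sums of x by the divide-and-conquer scheme.

    Assumes len(x) is a power of 2, as in the exercise statement.
    """
    n = len(x)
    if n == 1:
        return [x[0]]
    half = n // 2
    # The two recursive calls do not depend on each other,
    # so a parallel machine could evaluate them at the same time.
    z_left = prefix_sums(x[:half])
    z_right = prefix_sums(x[half:])
    # Combine step: s_i = z_i on the left half,
    # and s_i = z_i + z_(n/2) on the right half.
    offset = z_left[-1]
    return z_left + [z + offset for z in z_right]


print(prefix_sums([3, 1, 4, 1, 5, 9, 2, 6]))
# prints [3, 4, 8, 9, 14, 23, 25, 31]
```

    The sketch runs the two halves one after the other; the exercise is about what happens when they, and the final additions, are done in parallel.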

  2. Let A = (a_1, a_2, ..., a_n) be an array of elements, and let j_1 = 1 < j_2 < ... < j_s = n be a set of indices. Consider the problem of computing the array B = (b_1, b_2, ..., b_n) such that b_l = a_(j_i), for j_(i-1) < l ≤ j_i and 2 ≤ i ≤ s. For example, the array B corresponding to A = (4, 7, -1, 9, 6, 15) and j_1 = 1 < j_2 = 4 < j_3 = 6 is given by B = (4, 9, 9, 9, 15, 15).

    Develop a parallel algorithm for this problem whose running time is O(log n).
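
    For checking a parallel solution to exercise 2, a plain sequential reference can be handy. The sketch below takes O(n) sequential time and is not the O(log n) parallel algorithm the exercise asks for. The function name fill_from_indices is hypothetical, and the code assumes b_1 = a_1, which the formula leaves unstated but the example above implies.

```python
def fill_from_indices(a, j):
    """Sequential reference for exercise 2.

    a is the input array; j holds the 1-based indices j_1 = 1 < ... < j_s = n.
    Returns b with b_l = a_(j_i) for j_(i-1) < l <= j_i.
    """
    n = len(a)
    b = [None] * n
    b[0] = a[0]  # assumption: b_1 = a_(j_1) = a_1, as in the handout's example
    for prev, cur in zip(j, j[1:]):
        # 1-based positions l with prev < l <= cur all receive a_cur
        for l in range(prev + 1, cur + 1):
            b[l - 1] = a[cur - 1]
    return b


print(fill_from_indices([4, 7, -1, 9, 6, 15], [1, 4, 6]))
# prints [4, 9, 9, 9, 15, 15]
```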

  3. (Segmented Prefix Sums) We are given a sequence A = (a_1, a_2, ..., a_n) of elements from a set S, and a Boolean array B of length n, such that b_1 = b_n = 1. For each i_1 < i_2 such that b_(i_1) = b_(i_2) = 1 and b_j = 0 for all i_1 < j < i_2, we wish to compute the prefix sums of the subarray (a_(i_1 + 1), ..., a_(i_2)) of A.

    Develop an O(log n) time algorithm to compute all the corresponding prefix sums. Your algorithm should use O(n) operations and should run on the EREW PRAM.
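
    As a way to pin down what exercise 3 expects as output, here is a small sequential reference sketch. It is not the EREW PRAM algorithm the exercise asks for. It assumes the elements are numbers combined with ordinary addition (the exercise only says they come from a set S); the function name segmented_prefix_sums and the sample data are made up for illustration.

```python
def segmented_prefix_sums(a, b):
    """Sequential reference for exercise 3.

    Returns one list of prefix sums per segment (a_(i1+1), ..., a_(i2)),
    where i1 < i2 are consecutive positions at which b is 1.
    """
    assert len(a) == len(b) and b[0] == 1 and b[-1] == 1
    ones = [k for k, flag in enumerate(b) if flag == 1]  # 0-based positions of 1s
    result = []
    for i1, i2 in zip(ones, ones[1:]):
        running, sums = 0, []
        # 0-based k in i1+1 .. i2 covers the 1-based segment described above
        for k in range(i1 + 1, i2 + 1):
            running += a[k]
            sums.append(running)
        result.append(sums)
    return result


print(segmented_prefix_sums([3, 1, 4, 1, 5, 9], [1, 0, 1, 0, 0, 1]))
# prints [[1, 5], [1, 6, 15]]
```

    In the sample, the 1s sit at positions 1, 3, and 6, so the segments are (a_2, a_3) and (a_4, a_5, a_6).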

  4. Laboratory Exercise 10 implemented parallel algorithms to compute rank(value, a), trank(value, a), and indexedCount(value, i, a). Each of these algorithms runs in O(log n) time. Of course, with a sufficient number of processors, one could (in principle) perform as many computations of rank, trank, and indexedCount as desired -- all within O(log n) time.

    Use rank, trank, and indexedCount as needed, so that elements of an array a can be placed into a sorted array b with just one additional parallel step.

    Hint: Consider the location where each element a[i] should end up in b, based on rank, trank, and/or indexedCount.

Work To Be Turned In


This document is available on the World Wide Web as

     http://www.cs.grinnell.edu/~walker/courses/213.fa04/lab-prefix.shtml

created November 22, 2004
last revised November 30, 2004
For more information, please contact Henry M. Walker at (walker@cs.grinnell.edu)