# Lab: Parallelism

Assigned
Wednesday, Dec 12, 2018
Due
Sunday, Dec 16, 2018 by 10:30pm
Submitting
Please upload your completed bank.c and map.c source files to gradescope.

## Overview

In this lab, you will practice using threads and locks to complete the implementation of two simple parallel programs. Both programs include test drivers that will help you determine whether your implementation is working correctly. There are some additional requirements that are not checked by the test programs, so please read the instructions carefully.

All the starter code is available in parallelism.tar.gz. Download and unpack this archive to set up for the lab. For example,

\$ tar xzf parallelism.tar.gz


## Background

The POSIX library contains a wide variety of portable routines for managing concurrency. While these are covered and utilized in far more detail in CSC 213, it’s still important to gain some facility with these notions, even if you do not end up taking that class. To that end, we have created a suite of routines that provide both locking and thread-launching, both of which you’ll need for this lab.

The “library” (a bit of a misnomer) is a header-only library, meaning that instead of compiling a separate object file and linking with it, you merely need to include the appropriate header, which contains all the code (both declaration and implementation) for a set of procedures that nicely wrap the underlying pthreads library.

Good, defensive C programming generally requires you to check the return codes from system or library calls and take action accordingly. The main utility of these wrapper procedures is they both exclude arguments you won’t need to worry about and handle any errors for you by reporting them and exiting your entire program.

To use the “mythreads” library all you need to do is #include "../mythreads.h" in your source files. (In fact, the starter files already have this for you, so you don’t need to do anything special other than call the functions we lay out for you below.)

### Locking with a Mutex

In parallel programs, a lock is used to protect what is often called a critical section. Because we want access to a critical section of code to be mutually exclusive (that is, only one thread should be running it at a time), the lock is called a mutex.

Using a lock (or mutex) involves four steps:

1. Declare the lock variable
2. Initialize the lock variable
3. Attempt to lock the mutex before entering the critical section
4. Unlock the variable when the critical section has ended

The following code snippet demonstrates how to do this in the “mythreads.h” library:

pthread_mutex_t deadbolt; // 1. Declare the lock variable
mutex_init (&deadbolt);   // 2. Initialize the lock variable

// ...

mutex_lock (&deadbolt);   // 3. Attempt to lock the variable
// Here is the critical section
mutex_unlock (&deadbolt); // 4. Unlock the mutex


Creating a new thread of computation in a shared memory system allows these multiple computations to proceed concurrently (and perhaps in parallel if the host system has multiple processors).

• passing arguments to the “worker” function that the new thread is to start running, and
• deciding what to do once the independent thread is launched

Recall that function pointers in C allow us to pass references to procedures we want to execute, much like functions such as map and reduce in Scheme do. Launching a thread involves giving both the name of the procedure (which must both take a void * argument as a parameter and produce a void * as its return type), and the value of the parameter the procedure should be given when it starts running in the new thread.

The following C function meets these criteria.

/* Function that expects a pointer to a single character */
void *
worker (void *  arg)
{
return NULL;
}


This function takes a pointer to any type of value, so before we use that pointer (e.g., by dereferencing it), we need to cast it to the type we expect to have gotten (in this case, a char *, or a pointer to a single character). Note that this worker function does not produce a value, so we safely return the pointer NULL because we do not expect any one to make use of it.

And launching three separate threads to run this simple worker would look like the following.

pthread_t thread1, thread2, thread3; // Declare the thread variable(s)
char arg1, arg2, arg3;

arg1='A';
arg2='B';
arg3='C';



If you ran this code several times, you would perhaps see the characters A, B, and C printed in different orderings, depending on how the threads were scheduled to the processors.

Be careful that you have created separate argument values! The following approach creates a problematic race condition. Make sure you understand why before you move on.

pthread_t thread1, thread2, thread3; // Declare the thread variable(s)
char arg;

arg='A';

arg='B';

arg='C';


#### Waiting for Threads to Finish

If the launch above was all there was to your program, the thread running main might actually finish before all of the younger threads completed, leading to some unexpected behavior. Therefore, many multi-threaded programs also ask the main thread to block further execution until your worker threads have completed. This can be done in the “mythreads” library with the thread_join function:

thread_join (thread1); // Will not return until thread1 is complete


Bear in mind that thread 3 may have completed before thread 1. In that case, the last call to thread_join will return immediately, not having to wait for completion. Note that these procedures do not take pointers, but rather the actual thread values themselves.

#### Passing Multiple Parameters

Note that thread_create only allows you to pass a single argument to the worker function. If your thread’s worker function needs more than one parameter, you have to pack them together into a struct and pass the pointer to the struct. To accomplish this, we typically define a struct specifically to hold the worker function’s parameters.

For example, if we wanted to pass both a character and a number to a worker, we could create a C struct and corresponding worker function as follows:

typedef struct {
char alpha;
int num;
} alphanum_t;

void *
worker (void * arg)  {
alphanum_t * args = (alphanum_t*)arg; // Cast the pointer to simplify dereferencing
printf ("Received %c and %d", args->alpha, args->num);
return NULL;
}


To launch this worker, we’d take a similar approach to above.

pthread_t thread1, thread2; // Declare the thread variable(s)
alphanum_t arg1, arg2;

arg1.alpha='A';
arg1.num = 1;

arg2.alpha='B';
arg2.num = 2;



## Part A: Bank

The bank program implements a simple banking simulation. The main bank implementation is in bank.c; this is where you will make all of your changes. The simple simulation allows users to create accounts, add transactions, transfer money between accounts, and access both the transaction count and current balance for each account. Each account is accessed from a different thread.

Your task for this part of the lab is to fix the bank simulation. If you run the test program using the command make test, you will see that the banking system tends to lose money and transaction counts. This is because the bank simulation does not use any synchronization operations. You should add locks to the banking system to make sure it does not lose any transactions. However, you may not use a single lock to protect the whole bank! You must allow independent transactions on separate accounts to proceed in parallel.

Note that account creation happens in the test serially before any of the other tests are run, so you do not need to perform any synchronization in the create_account procedure.

Hint: To avoid deadlock between multiple threads trying to acquire the same set of locks, you should ensure they attempt to acquire the locks in the same order.

Please do not change anything in the bank-test.c and bank.h files. You are only required to submit bank.c for this part. It will be tested it with the original versions of the other two files.

## Part B: Map

For the second part of the lab, you will implement a parallel version of the familiar map function from Scheme. The starter code for this part includes a serial implementation, along with a test program that will run your parallel implementation with 1, 2, 4, and 8 threads, and validate all the results.

Your job is to implement parallel_map. The output from the function must be correct, but you must also adhere to a few additional requirements:

1. You may not create and destroy more than num_threads threads in the function. That means you will need to have each thread run the mapped function over more than one entry.
2. You will have to divide the work between threads at least somewhat evenly. There are many schemes you can use to do this, but I recommend thinking carefully about how you divide work to minimize the amount of synchronization.
3. The main thread should just wait for worker threads to complete the parallel_map. A more advanced implementation might have the main thread do work as well, but please do not do that for this lab.
4. Your parallel implementation must run faster when you add more threads, at least up to four threads. Your run with 1 thread will likely be slower than the serial version of the program, but you should see an improvement when you go to two and four threads.