Project: Proposal

Assigned:
Tuesday, Apr 17, 2018
Due:
Monday, Apr 23, 2018 by 10:30pm
Collaboration:
You are required to work in groups of two or three, although I will cap the total number of groups at 11 in each section of the course. You are welcome to discuss project ideas with your peers, but the work you submit must be your own. You must cite any sources that significantly contribute to your project plans.

Overview

For the final project in this class, you will need to identify a problem you would like to solve, then apply at least two different problem solving strategies from class, listed below:

Problem Solving Strategies

  1. Concurrency (Task or Data Parallelism)
  2. Synchronization
  3. Virtual Memory
  4. Memory Management
  5. Persistent Storage
  6. Debugging
  7. Scheduling
  8. Networks/Distributed Systems

This list is not necessarily complete, so feel free to suggest additions if you think I’ve missed something.

The specific project is up to you, but you should be prepared to argue that you project uses two or more of these strategies in an interesting way. It is not sufficient to write a parallel program that just uses synchronization operations; you must choose a project that solves a problem using non-trivial techniques such as a clever or complex synchronized data structure or an interesting approach to organizing threads. While this is not a hard requirement, it will probably be easiest to come up with reasonable project ideas that focus on a problem specific to one of these strategies, then couple that with one or more other strategies in your solution.

One concern to keep in mind as you think about your project is that you will need to evaluate your system in your final report on the project. As you work on your projects we will look at several research projects, all of which contain an experimental evaluation. This may influence your plans for an evaluation, but at this stage you should make sure you select a project where there is an interesting quantity to measure, such as runtime performance, memory use, network time, bug detection rates, etc. Ideally, this quantity should be closely connected with how well your system is working.

Once you have identified a project, you will need to find a group and prepare a proposal for that project.

Groups

You are free to choose your own groups for the project, but we will spend some time in class identifying areas of common interest and thinking about potential groups. You are required to work in groups of two or three students, but I will cap the total number of groups in each section at 11; this is a hard cap, dictated by the available time for our presentations during the final exam slot.

I generally will not approve cross-section project groups, but if you have a special circumstance I am willing to discuss the possibility. A likely requirement for this is that you will need to select one section of the class that all group members will attend on Tuesdays until the end of the semester. If this is not possible, it is very unlikely that I would approve your cross-section group.

Proposal

Your project proposal must:

  1. Explain the problem you are planning to solve.
  2. Describe the approach you will take to solving this problem.
  3. Explain how this system draws on at least two of the above key areas.
  4. Give a rough description of your implementation plans.

While you are not required to use figures in your proposal, they may make it easier to explain your plans. Your implementation plan should describe the major components of the system, a rough timeline for the implementation, and discuss any potential challenges in the implementation. You may want to consider possible alternate solutions if your proposed strategy does not work. Your proposal should be 2–3 pages of single-spaced text with a reasonable font.

Project Ideas

Here are a few example projects that would meet the requirements. You are welcome to propose one of these projects if you like, but I strongly encourage you to think about projects beyond the list below. Whether you choose one of these projects or your own idea, your proposal must address each of the requirements above.

A. File Indexing and Search (Persistent Storage, Concurrency)
Recursively scan the contents of a directory in parallel to build an inverted index, which maps works to the files that contain them. Use this index to search for files matching a query. Bonus: use the inotify library to watch for file changes and update the index.
B. Network Load Balancer (Networks, Scheduling)
Build a system that forwards network traffic to a pool of machines to balance load. The nature of the load and traffic is up to you. You could use the sysinfo function to determine the load on each machine and direct traffic to the least overloaded computer.
C. In-Memory Caching System (Networks, Synchronization, Memory Management)
Build a system that gives programs a place to cache important values in memory rather than on disk. Your implementation could mirror that of the memcached tool, or use a different API. Your system should be able to operate in a limited amount of memory, so you will need to consider when to free cached values and which ones to evict from the cache.
D. Deadlock Detector (Debugging, Concurrency, Synchronization)
Use LD_PRELOAD trick to create a custom mutex that detects when a program is deadlocked. You could instead report when deadlock is possible because threads acquire locks in a different order, even if the deadlock doesn’t actually occur.
E. Software Transactional Memory (Concurrency, Virtual Memory)
Rather than using locks, some programs use transactions to coordinate updates between threads. When using transactions, threads attempt to make their updates without synchronization, then go to commit their changes. A thread may find that a value it read or wrote during the transaction has been changed by another thread; in that case, the thread must roll back and retry the transaction with the updated values. If there are no conflicts, the thread can commit its changes and move on. You could implement such a system by running a transaction in a forked process, then copying updated values back into the original process to commit changes.
F. Networked Multiplayer Worm! (Networks, Concurrency)
Use the POSIX sockets API to build a multiplayer worm game where players share the same board. You will need to decide which parts of the game to run on a server and which parts run on the clients.

Project Ideas from Class

G. GrinnellDocs
Shared document editing, with data exchanged over the network.
H. Distributed Password Cracker
The space of possible passwords is divided up across multiple machines that work together to crack a password.
J. Graph Database
This system stores data in the form of vertices and edges rather than row-by-row in a normal database. You would generally use such a system to perform computations over a graph like queries about reachability
K. Parallel find
Search for matching files in parallel.
L. Memory Leak Detector
Find memory that is no longer reachable, or was not freed at program exit.
M. Truly Parallel Threads
Extend the thread implementation from the Worm! lab to run program threads on top of more than one hardware thread so the code is actually parallel.
N. Web Browser
Implement a (small subset of a) web browser. Making requests and receiving responses is manageable, but displaying pages can be complex.
O. Worm! Tournament
All players use the same board. They player that lasts the longest or eats the most apples wins the round.
P. Web Crawler
Request and index web pages. Rank those pages in search results.
Q. Garbage Collector for C
Find unreachable memory and free it.
R. Distributed Shared Memory
Use virtual memory and networks to make it possible for programs on different computers to run as if they have shared memory.
S. Different Concurrency Models
Implement a system that provides some sort of difference concurrency model, such as an event-driven program.
T. Minimal GDB
Implement a small subset of a debugger’s functionality.
U. Segfault Helper
Create a tool to help users diagnose segfaults. For example, you could report when a segfault occurs in read-only string memory, then offer a more useful explanation than just “Segmentation Fault”.
V. Fun Games!
Implement some sort of simple game over the network.
W. Web Server with Encryption
Implement a simple web server that supports one or more cryptographic protocols.

Evaluation

Your project proposal will be evaluated using four criteria:

Writing Quality
Is the proposal written clearly using correct grammar and spelling?
Problem Statement
Does the proposal adequately describe the purpose of the proposed work? How will you know whather your project is successful? This is an opportunity to propose an evaluation for your system, provided you are able to implement it as you’ve planned.
Project Scope
Does the proposed work span at least two key areas of this course? Does the proposal adequately justify this?
Implementation Plan
Is the proposed implementation plan sufficiently detailed? Does the proposal identify key milestones, a reasonable timeline, and at least a few potential pitfalls?

In addition to the grade for your initial proposal, I will send initial feedback on your proposed project soon after the proposal is due. You will need to address any serious concerns within a week of receiving proposal feedback.