Fundamentals of Computer Science II (CSC-152 97F)


# Outline of Class 49: Topological Sort

## Topological Sort

• Recall that the problem of topological sort is one of ordering (numbering) the nodes in a directed acyclic graph in such a way that if there is a nonempty path from A to B, then A appears before B in the ordering (A has a smaller number than B). The opposite convention, in which A appears after B (A has a larger number than B), works equally well as long as it is applied consistently.
• Here is a simple algorithm for topologically sorting a graph:
```
while not done
  if the graph is empty, then we're done (exit the loop)
  pick a node with no predecessors
  if no such node exists, then the graph is cyclic (exit the loop)
  output that node (number that node)
  delete that node from the graph
end while
```
• In more Java-esque form:
```
/**
 * Output the nodes in the current acyclic directed graph so that
 * when there is a path from A to B, A is output before B.
 * pre: The graph is acyclic
 * post: The graph may be modified
 * post: All nodes in the graph are output
 * post: The output order follows the dependencies of the graph.
 *
 * @exception CyclicException if the graph is cyclic.
 */
void topologicalSort() throws CyclicException {
  // If the graph is empty, then we're done
  if (empty()) {
    return;
  }
  // Find a node with no predecessors
  Node small = fewestPredecessors();
  // If no such node exists, the graph is cyclic
  if (small.predCount() != 0) {
    throw new CyclicException("Can't toposort cyclic graphs");
  }
  // Output that node
  System.out.print(small.contents());
  // Delete it from the graph
  delete(small);
  // And go on
  topologicalSort();
} // topologicalSort
```
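• The simple algorithm above can be tried out with a small self-contained sketch. It is only an illustration under assumptions of my own: the class name `NaiveTopoSort`, the representation of the graph as a map from each node to its list of successors, and the use of `IllegalStateException` in place of `CyclicException` are all hypothetical and not part of the course's graph classes. It uses a loop rather than recursion and returns the order instead of printing it.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical demo class; not part of the course's graph library. */
class NaiveTopoSort {
    /**
     * Returns the nodes in topological order.
     * Assumption: graph maps every node to the list of its successors,
     * and every node appears as a key. The caller's map is not modified.
     */
    static List<String> sort(Map<String, List<String>> graph) {
        // Work on a copy so that "deleting" nodes leaves the input alone.
        Map<String, List<String>> remaining = new LinkedHashMap<>(graph);
        List<String> order = new ArrayList<>();
        while (!remaining.isEmpty()) {
            // Collect every remaining node that still has a predecessor.
            Set<String> hasPred = new HashSet<>();
            for (List<String> successors : remaining.values()) {
                hasPred.addAll(successors);
            }
            // Pick a node with no predecessors.
            String next = null;
            for (String node : remaining.keySet()) {
                if (!hasPred.contains(node)) {
                    next = node;
                    break;
                }
            }
            // If no such node exists, the graph is cyclic.
            if (next == null) {
                throw new IllegalStateException("Can't toposort cyclic graphs");
            }
            order.add(next);        // output that node
            remaining.remove(next); // delete it, along with its outgoing edges
        }
        return order;
    }

    public static void main(String[] args) {
        Map<String, List<String>> g = new LinkedHashMap<>();
        g.put("a", List.of("b", "c"));
        g.put("b", List.of("d"));
        g.put("c", List.of("d"));
        g.put("d", List.of());
        System.out.println(sort(g)); // prints [a, b, c, d]
    }
}
```

• Note that this sketch rescans every remaining edge on each round to find a node with no predecessors, so it is deliberately unoptimized.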

### Running Time

• What is the running time of our initial topological sort algorithm?
• It depends on the running time of some of the functions that our routine calls.
• In particular, we call `delete()` and `fewestPredecessors()` n times, where n is the number of nodes in the graph.
• Each `delete()` is likely to take time proportional to the number of edges incident to the deleted node.
• Each `fewestPredecessors()` is likely to take O(n) time in a graph structure not optimized for this operation.
• It could take even more: if the graph does not store predecessor counts, it might be necessary to count the predecessors of each node by looking at the edges, which will take O(m) time, where m is the number of edges in the graph.
• The total time for all `delete()`s will then be O(m), where m is the number of edges in the graph.
• The total time for all calls to `fewestPredecessors()` will then be O(n^2).
• The running time of this algorithm is therefore O(n^2 + m).
• Since m is bounded above by n^2, the running time of this algorithm is O(n^2).
• Can we make it faster? Certainly. We can directly attack these two operations, or we can rethink our original structure.
• While it might seem that we could make `fewestPredecessors()` take less time by using a heap, it turns out that this isn't as good a solution as it might seem because we need to update the predecessor count of a number of nodes each time we delete a node.
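• The O(n^2 + m) bound above can be written out as a sum over the n deleted nodes. The notation here is an assumption introduced for illustration: V is the set of nodes and deg(v) is the number of edges incident to v when it is deleted. Each edge is deleted once, so the deg(v) terms add up to O(m).

```latex
T(n, m) \;=\; \sum_{v \in V} \Bigl( \underbrace{O(n)}_{\texttt{fewestPredecessors()}}
              \;+\; \underbrace{O(\deg(v))}_{\texttt{delete()}} \Bigr)
        \;=\; O(n^2) + O(m)
        \;=\; O(n^2), \qquad \text{since } m \le n^2 .
```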

### An Improved Implementation

• We can improve the running time of topological sort by choosing an appropriate representation and supplementing the representation with some additional structures.

• In this implementation of topological sort, we'll represent the graph with adjacency lists. We'll associate a list of neighbors (nodes reachable in one step) with each node.
• To improve identification of nodes without predecessors, we'll keep a list of all such nodes in a linear structure (e.g., a stack or a queue).
• Here's our improved implementation, in pseudo-Java:
```
/** See description above. */
public void topologicalSort() throws CyclicException {
  // Variables
  Dictionary pred_count = new Dictionary();  // Predecessor count of each node
  Dictionary adjacent = new Dictionary();    // Neighbors of each node
                                             // (name and Vector lists assumed)
  Linear no_pred = new Linear();             // Nodes with no predecessors
  Edge e;          // The next edge to deal with
  Object source;   // Source of directed edge
  Object sink;     // Sink of directed edge

  // Fill in the adjacency list
  for (Enumeration edges = this.edges(); edges.hasMoreElements(); ) {
    e = (Edge) edges.nextElement();
    source = e.source();
    sink = e.sink();
    // If the source or sink isn't in the dictionaries, add it
    if (pred_count.get(source) == null) {
      pred_count.put(source, new Counter());
      adjacent.put(source, new Vector());
    }
    if (pred_count.get(sink) == null) {
      pred_count.put(sink, new Counter());
      adjacent.put(sink, new Vector());
    }
    // Update adjacency list for source
    ((Vector) adjacent.get(source)).addElement(sink);
    // Update predecessor count for sink (not real Java)
    ((Counter) pred_count.get(sink)).increment();
  } // for

  // Identify all nodes with no predecessors
  for (Enumeration nodes = pred_count.keys(); nodes.hasMoreElements(); ) {
    source = nodes.nextElement();
    if (((Counter) pred_count.get(source)).isZero()) {
      no_pred.add(source);
    }
  } // for

  // As long as there are nodes with no predecessors, output and
  // "delete" them
  while (!no_pred.empty()) {
    source = no_pred.next();
    System.out.print(source);
    for (Enumeration neighbors = ((Vector) adjacent.get(source)).elements();
         neighbors.hasMoreElements(); ) {
      sink = neighbors.nextElement();
      ((Counter) pred_count.get(sink)).decrement();
      if (((Counter) pred_count.get(sink)).isZero()) {
        no_pred.add(sink);
      }
    } // for
  } // while

  // Are there nodes remaining?  If so, it was a cyclic graph.
  for (Enumeration nodes = pred_count.keys(); nodes.hasMoreElements(); ) {
    source = nodes.nextElement();
    if (!((Counter) pred_count.get(source)).isZero()) {
      throw new CyclicException();
    }
  } // for
} // topologicalSort()
```
• I'm relying on a nonexistent `Counter` class that we can use to count references. It should be trivial to implement that counter.
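• One possible `Counter`, as a minimal sketch. Only the method names `increment()`, `decrement()`, and `isZero()` come from the code above; the rest (an `int` field starting at zero) is an assumption.

```java
/** A minimal sketch of the Counter assumed by the code above. */
class Counter {
    /** The current count; a new counter starts at zero. */
    private int count = 0;

    /** Add one to the count. */
    void increment() {
        count++;
    }

    /** Subtract one from the count. */
    void decrement() {
        count--;
    }

    /** Is the count exactly zero? */
    boolean isZero() {
        return count == 0;
    }
}
```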

### Running Time

• What is the running time of this improved implementation?
• During construction, we do O(m) insertions, where m is the number of edges.
• In the initial search for nodes with no predecessors, we do O(n) steps looking at each node.
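• The core loop removes each node from the list of nodes with no predecessors once and follows each edge once, so it takes O(n + m) steps.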
• The error checking loop takes O(n) steps.
• So, the algorithm takes O(n + m) steps.
• In most graphs, n is bounded by a constant multiple of m (for example, in any connected graph, m >= n-1), so we might simplify this to O(m).
• However, it is better to leave it in the original form in case we deal with graphs with very few edges.
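• For readers who want something that compiles, here is a sketch of the same improved approach using standard `java.util` collections. The names are hypothetical and chosen for illustration: the class `FastTopoSort`, edges given as `{source, sink}` string pairs, plain `Integer` counts instead of `Counter`, and `IllegalStateException` instead of `CyclicException`. The comments mark which part of the O(n + m) bound each loop accounts for.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical demo class; not part of the course's graph library. */
class FastTopoSort {
    /** edges[i] is assumed to be a {source, sink} pair. */
    static List<String> sort(String[][] edges) {
        Map<String, Integer> predCount = new LinkedHashMap<>();
        Map<String, List<String>> adjacent = new LinkedHashMap<>();

        // Construction: O(m) insertions.
        for (String[] e : edges) {
            String source = e[0];
            String sink = e[1];
            predCount.putIfAbsent(source, 0);
            adjacent.putIfAbsent(sink, new ArrayList<>());
            adjacent.computeIfAbsent(source, k -> new ArrayList<>()).add(sink);
            predCount.merge(sink, 1, Integer::sum); // one more predecessor
        }

        // Initial scan for nodes with no predecessors: O(n).
        Deque<String> noPred = new ArrayDeque<>();
        for (Map.Entry<String, Integer> entry : predCount.entrySet()) {
            if (entry.getValue() == 0) {
                noPred.add(entry.getKey());
            }
        }

        // Core loop: each node is removed once, each edge is followed once.
        List<String> order = new ArrayList<>();
        while (!noPred.isEmpty()) {
            String source = noPred.remove();
            order.add(source);
            for (String sink : adjacent.get(source)) {
                if (predCount.merge(sink, -1, Integer::sum) == 0) {
                    noPred.add(sink);
                }
            }
        }

        // Error check: any node never output must lie on or after a cycle.
        if (order.size() < predCount.size()) {
            throw new IllegalStateException("cyclic graph");
        }
        return order;
    }

    public static void main(String[] args) {
        String[][] edges = { {"a", "b"}, {"a", "c"}, {"b", "d"}, {"c", "d"} };
        System.out.println(sort(edges)); // prints [a, b, c, d]
    }
}
```

• Because the graph is built from its edges, a node with no edges at all never appears in the output; a fuller version would also accept a list of nodes.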

## Shortest Path

• Let's turn to the problem of finding the shortest path between two nodes in a graph.
• As we saw last class, a path, p, between A and B is a sequence of edges in which each edge shares a node with the next edge, with A in the first edge and B in the last edge.
• In a directed graph, the source of edge i is the sink of edge i-1.
• If there is a cost function, f, associated with paths, then the shortest path between two nodes is a path, p, such that f(p) <= f(q) for all paths q between the same two nodes.
• There are a variety of useful cost functions:
• The number of edges in the path
• The sum of the edge weights in the path
• The maximum/minimum edge weight in the path
• ...
• For some of these functions, it may be necessary to restrict edge weights or limit the types of permissible paths.
• For example, if edge weights can be negative and paths can include cycles, then there may be no "minimum sum" path in the graph.
• Hence, we may limit ourselves to cycle-free paths or nonnegative edge weights (or even both).
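• To make the role of the cost function concrete, here is a small sketch in which the choice of f changes which path counts as "shortest". Everything in it is an assumption made for illustration: the class `PathCosts`, the representation of a path as the list of its edge weights, and the three candidate paths in `main`. It only compares paths that are handed to it; it does not search a graph.

```java
import java.util.List;
import java.util.function.ToDoubleFunction;

/** Hypothetical demo class: a path is the list of its edge weights. */
class PathCosts {
    /** Cost = the number of edges in the path. */
    static double edgeCount(List<Double> path) {
        return path.size();
    }

    /** Cost = the sum of the edge weights in the path. */
    static double weightSum(List<Double> path) {
        double sum = 0;
        for (double w : path) {
            sum += w;
        }
        return sum;
    }

    /** Cost = the maximum edge weight in the path. */
    static double maxWeight(List<Double> path) {
        double max = Double.NEGATIVE_INFINITY;
        for (double w : path) {
            max = Math.max(max, w);
        }
        return max;
    }

    /** Returns a path p among the candidates with f(p) <= f(q) for all q. */
    static List<Double> shortest(List<List<Double>> candidates,
                                 ToDoubleFunction<List<Double>> f) {
        List<Double> best = null;
        for (List<Double> q : candidates) {
            if (best == null || f.applyAsDouble(q) < f.applyAsDouble(best)) {
                best = q;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Three made-up paths between the same two nodes.
        List<List<Double>> paths = List.of(
            List.of(3.0),            // one heavy edge
            List.of(1.0, 1.5),       // two medium edges
            List.of(1.0, 1.0, 1.0)); // three light edges
        System.out.println(shortest(paths, PathCosts::edgeCount)); // prints [3.0]
        System.out.println(shortest(paths, PathCosts::weightSum)); // prints [1.0, 1.5]
        System.out.println(shortest(paths, PathCosts::maxWeight)); // prints [1.0, 1.0, 1.0]
    }
}
```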


Disclaimer Often, these pages were created "on the fly" with little, if any, proofreading. Any or all of the information on the pages may be incorrect. Please contact me if you notice errors.