CSC153 2004S, Class 50: Priority Queues Admin * Lots to cover today; avoid distractions * HW7 due. * HW8 (final homework) assigned * Cite stuff you find on the Web. * No exam 4! * Cool convo today: DNA Evidence * Cool protest today: Light polution ("We want heavy polution!") * A travesty of a play tonight. Overview: * Priority queue basics * Simple implementations * Detour: Binary trees * Heaps * Implementing heaps * Heapsort Priority Queues: * Not queues (which have a FIFO policy) * Are linear structures * The value returned by get and peek is determined by a "priority" assigned to each value in the queue * Techniques for assigning values: * Attach a "priority" number to each value in the queue * Use a Comparator (like an Order) to compare things (slightly more general) Observation: * Priority queues can be a nice technique for sorting. * A good implementation of priority queues may lead to a good sorting algorithm. How do we implement priority queues? * Use a dynamic array or a linked list and keep the values in sorted order. add - O(n) get - O(1) if done well [Effectively insertion sort] * Use a dynamic array or linked list and store the values in no particular order. add - O(1) get - O(n) to find the smallest value How do we normally improve/design algorithms? * Brute force * Greediness * Dynamic programming (store intermediate values in a table) * Divide and conquer * Binary search * Quicksort * Mergesort * We can also apply divide and conquer to data structure design Typical d-a-c data structure: The tree (much like a family tree) * A tree is built from node-like things * Each has an associated value * Each has zero or more "children", which are also nodes/trees * In the most general case, there is no relationship between the nodes (or their values) other than the "child" relationship. * However, we often add other relationships. Some terminology: * root: The starting point of the tree. * leaf: Nodes with no children * interior node: something neither a root nor a leaf * parent: inverse of child relationship * [forest: a collection of trees (should not share nodes)] * [graph: like a tree, but nodes can be shared and we use different terms for the relationships] * descendant: child, child of child, child of child of child, etc. * ancestor: inverse of descendant relationship * sibling: a node with the same parent * depth: the number of nodes in the longest path from root to leaf * level: all nodes the same distance from the root * binary tree: a tree in which all the nodes have <= 2 children * complete binary tree: a tree of n levels in which the nodes at levels 0 ... n-1 all have two children * nearly-complete binary tree: a tree of n levels in which the nodes at levels 0 .. n-2 all have two children and, at level n-1, if a node has one or two children than all of its left siblings have two children * Informally: Everything but the last level is full; the last level is "shoved to the left" * nodes are ungendered Heap: An implementation of priority queues with binary trees * The tree must be nearly complete (depth of the tree is as small as possible; O(logn) a tree of k levels has 2^k-1 nodes * The tree must have the heap property * The priority of each node is at least as high as the priority of its children How long does it take to find the highest priority element in a heap (peek)? * O(1) because the root is the highest priority element How do we add something to a heap? * Maintain near completeness - Add to the leftmost part of the bottom level * Restore the heap property by repeatedly swapping the value "up" the tree if it has a higher priority than its parent Theorem (unproven): If it was a heap to start with, it will be a heap when we're done * O(logn) How do we remove the highest-priority element from a heap? * Maintain near completeness by putting the rightmost element at the lowest level at the root * Restore the heap property by repeatedly "swapping down" with the larger of the two children * O(logn) Problem in our analysis: How do you find the leftmost free position or the rightmost element? * Do math to figure it out * Keep a current pointer to the next free position in the tree (or the last filled position, or both): Upon addition: If you are the left child of your parent An interesting variant: Shove the values in an array/vector using a breadth-first left-to-right numbering of the positions * Left child of node at position k is at position: 2k+1 * Right child of node at position k is at position: 2k+2 * Parent of a node at position k is at position (k-1) div 2 * This makes it much easier to find the next free space (just keep track of the number of values in the array/Vector) * This saves all the child/parent pointer crap. Note that you can take an existing array and turn it into a heap fairly quickly * Grow the heap from left to right * Build a forest and repeatedly join