CSC297.Java, Class 31: Linear Structures

Admin:
* Hi Alex, are you watching?
* Review of Homework
* Alex meets at 8:30 on Friday to catch up; Yvonne at 9:00
* New homework:
  * Implement queues using arrays and the "loop around to the front" technique

Overview:
* Collections, Revisited
* Linear Structures
* Queues
* Stacks
* Priority Queues
* ...

What are Collections, at least in the Java Collections Framework?
* A bit like lists
* Sun sometimes suffers from feeping creaturism (creeping featurism)
  * To get the best implementation of a data type, limit its features
* At least three ways we use collections:
  * As things we can iterate (shove stuff in, look at the stuff one by one)
  * As things we can simply add to and delete from: Two basic operations are add and get (get is listed as remove() below)
  * As things in which we look for values by key (e.g., association lists in Scheme)
* Similar: each is a group of stuff (e.g., a "Collection")
* However, they are very different applications of collections

We will focus on the "second" variant today. These are typically called "linear structures".
* Basic operations
  * create a new linear collection
  * void add(Object): add something to the linear collection
  * Object remove(): remove something from the linear collection (no parameters; it's the "natural thing" to remove)
  * Object peek(): check what you would remove next, but don't remove it
  * boolean isEmpty(): is there anything left?
* What does remove() remove? You can choose a "policy" for removal.
  * RandomLinearStructure: "Some element" is removed. That element is unpredictable.
  * Queue: Elements are removed in the order in which they are added (First In, First Out).
    * Models a "line" or "queue" at a store
    * Provides "fair" treatment
  * Stack: Elements are removed in the reverse of the order in which they are added (Last In, First Out)
    * Models the plates in a cafeteria
    * Useful for replicating the behavior of recursive procedures
  * Priority Queue: Elements are removed in order of "importance" (requires a Comparator to determine "importance")

Some Questions:
* Are these interfaces or classes? All seem like interfaces.
  * Linear
  * RandomLinear
  * Queue
  * Stack
  * PriorityQueue
* How should we implement them?
  * Using linked nodes.
  * Using doubly-linked nodes.
  * Using arrays or vectors.
* Implementing stacks using vectors
  * Put the newest thing at the end
  * Remove the newest thing from the end
* Implementing stacks using nodes and links
  * Add to end and remove from end
  * OR Add to beginning and remove from beginning (better)
* Implementing queues using nodes and links
  * Keep track of front and end
  * Add to the end and remove from the front
* Implementing queues using arrays or vectors?
  * Bad strategy: get() deletes the first element and then you shift everything else down, O(n)
  * Better strategy: get() deletes the "first" element and you change your notion of where the first element is.
    * If you use arrays and keep adding, you might then "wrap" back to the beginning as you add to the end
    * If you use arrays and add more stuff than you've deleted, you eventually need to copy everything into a bigger array

How efficient are these implementations?
* add() takes? O(1) (except when an array has to be copied into a bigger one)
* get() takes? O(1)

How might we implement priority queues? (We'll assume that the "smallest" is the "highest priority")
* Strategy one: Use a vector and our favorite "findSmallest" operation
  * add() simply shoves on the end, O(1)
  * get() uses findSmallest and looks at all elements, O(n)
* Strategy two: Use a vector and store the elements in sorted order (like the "insert" from insertion sort)
  * add() takes O(n) to find and insert
  * get() takes O(1) (keep the smallest at the end so nothing has to shift)
* Can we do better? Perhaps O(1), perhaps O(log_2(n))?
  * When we've gotten O(log_2(n)) for algorithms, what strategy have we normally used? Divide and conquer!
  * We're going to design a divide and conquer data structure
* One divide-and-conquer structure: The binary search tree
  * The root is the "middle" value
  * Everything to the left is smaller
  * Everything to the right is larger
  * The subtree to the left is also a binary search tree
  * The subtree to the right is also a binary search tree
  * To find the smallest, keep going to the left until you find the smallest thing
  * If the tree is balanced, each findSmallest is O(log_2(n))
  * If the tree is unbalanced (e.g., after you delete lots of small things), it may be O(n)
  * Balancing binary search trees is complicated! Take 301 for more info.
* Observation: Binary search trees may be overkill for our purposes.
  * Since all we care about is the smallest, can we find a structure that is easier to keep balanced?
* Solution: The Heap
  * A heap is a tree that
    * (a) Is nearly complete. Every level of the tree except the last has all the elements. The last level has everything "shoved to the left"
    * (b) Has the heap property. The root is the "smallest" element. Each subtree has the heap property.
  * To add():
    * shove it at the place that maintains "nearly complete" O(1)
    * swap it up through the tree until it is larger than the thing above it (or reaches the root) O(log_2(n))
  * To peek()
    * return the root; do not delete it
  * To get()
    * return the root; delete it; and then take the last thing on the last level and shove it in the root; then repeatedly swap down using the smaller of the two children O(log_2(n))
* Observation: This can be used for sorting
  * Shove everything into the heap O(n*log_2(n))
  * Remove everything (in order!) O(n*log_2(n))
  * Ta da! Heap sort
  * Implemented on Friday
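
The heap operations above can be sketched in Java roughly as follows. This is a minimal sketch, not the version to be implemented on Friday. The class name HeapPriorityQueue, the heapSort helper, and the choice to store the nearly complete tree level by level in an ArrayList (so the children of index i sit at 2*i + 1 and 2*i + 2) are assumptions made for illustration. The method names add, peek, get, and isEmpty follow the notes.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.NoSuchElementException;

/** Illustrative min-heap priority queue; "smallest" means highest priority. */
class HeapPriorityQueue<T> {
    // Assumption: the tree is stored level by level, root at index 0.
    private final List<T> items = new ArrayList<>();
    private final Comparator<T> order;

    HeapPriorityQueue(Comparator<T> order) {
        this.order = order;
    }

    boolean isEmpty() {
        return items.isEmpty();
    }

    /** Put the value in the next free slot on the last level, then swap up. */
    void add(T value) {
        items.add(value);
        int child = items.size() - 1;
        while (child > 0) {
            int parent = (child - 1) / 2;
            if (order.compare(items.get(child), items.get(parent)) >= 0) {
                break; // not smaller than the thing above it
            }
            swap(child, parent);
            child = parent;
        }
    }

    /** Return the root (the smallest element) without deleting it. */
    T peek() {
        if (items.isEmpty()) {
            throw new NoSuchElementException("heap is empty");
        }
        return items.get(0);
    }

    /** Delete and return the root; move the last element to the root; swap down. */
    T get() {
        T root = peek();
        T last = items.remove(items.size() - 1);
        if (!items.isEmpty()) {
            items.set(0, last);
            int parent = 0;
            while (true) {
                int left = 2 * parent + 1;
                int right = left + 1;
                if (left >= items.size()) {
                    break; // no children left
                }
                // Pick the smaller of the two children (or the only child).
                int smaller = left;
                if (right < items.size()
                        && order.compare(items.get(right), items.get(left)) < 0) {
                    smaller = right;
                }
                if (order.compare(items.get(parent), items.get(smaller)) <= 0) {
                    break; // heap property restored
                }
                swap(parent, smaller);
                parent = smaller;
            }
        }
        return root;
    }

    private void swap(int i, int j) {
        T tmp = items.get(i);
        items.set(i, items.get(j));
        items.set(j, tmp);
    }

    /** Heap sort as in the notes: shove everything in, then remove everything. */
    static <T> List<T> heapSort(List<T> input, Comparator<T> order) {
        HeapPriorityQueue<T> heap = new HeapPriorityQueue<>(order);
        for (T value : input) {
            heap.add(value);
        }
        List<T> sorted = new ArrayList<>();
        while (!heap.isEmpty()) {
            sorted.add(heap.get());
        }
        return sorted;
    }
}
```

For example, calling HeapPriorityQueue.heapSort(Arrays.asList(5, 3, 8, 1, 4), Comparator.naturalOrder()) returns the list [1, 3, 4, 5, 8]. Both add() and get() walk at most one path between the root and the last level, which is where the O(log_2(n)) bounds come from.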