In this laboratory session, you will investigate a number of related algorithms. Initially, you will consider some algorithms used to find the smaller elements of an array. You will then consider ways to enhance these algorithms in order to sort arrays (place the elements in order).
The goals of this laboratory session are to:
Your instructor will tell you which of the proposed experiments you are to perform.
Prerequisite skills:
Required files:
If you've done any reading about algorithm analysis, you've learned that computer scientists tend to analyze algorithms in terms of an upper bound on their expected or worst-case running times, and that they express those running times in terms of an unknown constant times a function of the size of the input. Such running times are written O(the function) and pronounced ``big O of the function'' or ``order the function''.
For example, if we were deleting the smallest element of an array, we might say that it takes O(n) time, where n is the number of elements in the array. Why? There may be cases in which it takes less. For example, if we know the smallest element is at the end of the array, then we can probably delete it in one step. But how do we determine that it's the smallest element? Usually by comparing it to all the other elements. In addition, if we don't want to leave gaps in the array, we may need to shift all the elements left one space. If we end up deleting the leftmost element, that's another n ``steps''.
When we say that a method requires O(f(n)) steps, we mean that there is
some constant c such that the number of steps for an input
of size n is never more than (but sometimes less than) c*f(n), no matter
what we count as steps. (The choice of c may depend on our definition
of ``step''.) Different methods with the same big-O running time may have
very different constants. At the same time, choosing a different
algorithm with a ``smaller'' function can have a much bigger impact on
the actual running time of an algorithm, even when the smaller function
has a larger constant.
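To make that last point concrete, here is a small sketch that compares two made-up step counts, 100*n and n*n. The constants, the class name GrowthDemo, and its method names are assumptions chosen for illustration; they do not describe any particular algorithm.

```java
// Hypothetical step counts: a linear method with a large constant
// versus a quadratic method with a small constant.
class GrowthDemo {
    // Assumed cost of the "O(n)" method: 100 steps per element.
    static long linearSteps(long n) {
        return 100 * n;
    }

    // Assumed cost of the "O(n^2)" method: one step per pair of elements.
    static long quadraticSteps(long n) {
        return n * n;
    }

    // Smallest n at which the quadratic method takes more steps.
    static long crossover() {
        long n = 1;
        while (quadraticSteps(n) <= linearSteps(n)) {
            ++n;
        }
        return n;
    }

    public static void main(String[] args) {
        System.out.println("Quadratic is slower from n = " + crossover());
        System.out.println("At n = 10000: " + linearSteps(10000)
            + " steps vs. " + quadraticSteps(10000) + " steps");
    }
}
```

For small inputs the quadratic method wins because of its smaller constant, but past the crossover point the linear method is always ahead, and the gap keeps growing.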
In the following discussion and subsequent experiments, we will investigate these issues in more depth.
Suppose you were asked to find the smallest element of a sequence. You might do this by assuming that the first element is the smallest and then stepping through the remaining elements, updating your estimate of the smallest whenever you found a smaller element. In pseudocode,
guess = the first element of the sequence
for each remaining element of the sequence, e
  if (e < guess) then
    guess = e;
  end if
end for
Using arrays in Java, we might express this as
/**
 * Compute the smallest element in the sequence.
 */
public int smallest() {
  // Our guess as to the smallest element
  int guess = this.elements[0];
  // A counter variable
  int i;
  // Look through all subsequent elements
  for (i = 1; i < this.elements.length; ++i) {
    // If the element is smaller than our guess, then
    // update the guess
    if (this.elements[i] < guess) {
      guess = this.elements[i];
    } // if
  } // for
  // That's it, we're done
  return guess;
} // smallest()
As a variation, we might write an indexOfSmallest that
returns the index of the smallest element in a subsequence. Why
would we want such a method? As you've seen, whenever you write a
method for a sequence, it is helpful to write a similar method for
a subsequence. Why return an index rather than the actual value?
Because it will be helpful for the subsequent experiments.
If we decided to generalize smallest in this way, we might change it to
/**
 * Compute the index of the smallest element in the subsequence
 * given by lower bound lb and upper bound ub.
 */
public int indexOfSmallest(int lb, int ub) {
  // Make sure the upper bound and lower bound are reasonable.
  if (lb < 0) {
    lb = 0;
  }
  if (ub >= this.elements.length) {
    ub = this.elements.length - 1;
  }
  // Our guess as to the index of the smallest element
  int guess = lb;
  // A counter variable
  int i;
  // Look through all subsequent elements
  for (i = lb + 1; i <= ub; ++i) {
    // If the element is smaller than our guess, then
    // update the guess
    if (this.elements[i] < this.elements[guess]) {
      guess = i;
    } // if
  } // for
  // That's it, we're done
  return guess;
} // indexOfSmallest()
In experiment L3.1 you will investigate these two methods in a little more depth. You will also begin to examine the three classes you will use in the remaining experiments.
Suppose you were instead asked to find the two smallest entries in a sequence. One question you might ask would be ``How should I return two values?'' So that we need not concern ourselves with that question, let us instead try to move the two smallest entries to the first two positions of the array.
One approach would be to look through the sequence to find the smallest
entry and move it to the front of the sequence, then look through all but
the first element of the modified sequence for the next smallest element.
Using the indexOfSmallest method described above, we might
phrase this as
/**
 * Put the two smallest elements of the sequence at the beginning
 * of the sequence. The sequence must have at least two elements.
 */
public void twoSmallest() {
  // Swap the initial element with the smallest
  swap(0, indexOfSmallest(0, this.length()-1));
  // Swap the next element with the smallest remaining
  swap(1, indexOfSmallest(1, this.length()-1));
} // twoSmallest()
As you might guess, swap(i,j) swaps the elements at positions
i and j in the sequence.
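If you are curious how such a swap might look, here is a minimal sketch. It assumes the sequence keeps its values in an int array field named elements, as the methods above do. The class name SwapDemo is a stand-in invented for this example, and the swap in the lab's own files may be written differently.

```java
// A stand-in for the lab's sequence class, with only the pieces
// needed to illustrate swap. (SwapDemo is a hypothetical name.)
class SwapDemo {
    int[] elements;

    SwapDemo(int[] elements) {
        this.elements = elements;
    }

    // Exchange the values stored at positions i and j.
    void swap(int i, int j) {
        int temp = this.elements[i];
        this.elements[i] = this.elements[j];
        this.elements[j] = temp;
    }

    public static void main(String[] args) {
        SwapDemo seq = new SwapDemo(new int[] { 7, 3, 9 });
        seq.swap(0, 1);
        System.out.println(java.util.Arrays.toString(seq.elements));
    }
}
```

Note that swapping a position with itself leaves the sequence unchanged, which is what twoSmallest needs when the smallest element is already in place.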
Now, how might we put the five smallest elements in a sequence of 50 elements at the front of that sequence? One approach would be to comb through the sequence to find the smallest entry and move it to the front of the sequence. Next, you could comb through the 49 entries following this newly positioned entry to find the next smallest entry and move it to the position following the smallest entry. By repeating this process three more times, each time finding the smallest entry remaining in the sequence and placing it just behind the entry found in the previous pass, you will have placed the five smallest entries at the beginning of the sequence in increasing order of size.
When turning this narrative into code, it is appropriate to use a loop (since the five pieces are quite similar). For example,
/**
 * Put the five smallest elements of the array at the beginning of
 * the array (naive method). The sequence should have at least
 * five elements.
 */
public void fiveSmallest() {
  int i;
  // For each index i from 0 to 4,
  for (i = 0; i < 5; ++i) {
    // Swap the smallest element in [i .. last] with the ith element.
    swap(i, indexOfSmallest(i, this.length()-1));
  } // for
} // fiveSmallest()
What we have accomplished is a partial sorting of the sequence by selecting the smallest entries. Thus, we call this algorithm the partial selection sort.
Our task now is to analyze the efficiency of this approach. This we do in terms of the number of times two entries in the sequence are compared. To find and position the smallest entry in the sequence requires 49 comparisons, to process the next smallest entry requires 48, and so on. Thus, the total number of comparisons to find the five smallest entries is
49 + 48 + 47 + 46 + 45 = 235
In general, applying this selection method to find the k smallest entries in a sequence of n entries requires
(1/2)(2*n*k - k^2 - k)
comparisons. (Can you derive this formula?) Thus, to find the 10 smallest entries in a sequence of 10,000 entries requires 99,945 comparisons. We might also say that this is an O(n*k) algorithm.
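One way to check the formula without deriving it is to count the comparisons directly. The sketch below models only the loop structure of the partial selection sort (pass i compares one guess against each later position) and sets the count beside the closed form. The class and method names are invented for this example.

```java
// Count comparisons in a partial selection sort and compare the
// result with the closed form (1/2)(2*n*k - k^2 - k).
class PartialSelectionCount {
    // Pass i (counting from 0) scans positions i+1 .. n-1 and makes
    // one comparison at each position.
    static long countComparisons(int n, int k) {
        long comparisons = 0;
        for (int i = 0; i < k; ++i) {
            for (int j = i + 1; j < n; ++j) {
                ++comparisons;
            }
        }
        return comparisons;
    }

    // The closed form from the discussion.
    static long formula(long n, long k) {
        return (2 * n * k - k * k - k) / 2;
    }

    public static void main(String[] args) {
        // Five smallest of 50 entries: both values are 235.
        System.out.println(countComparisons(50, 5) + " " + formula(50, 5));
        // Ten smallest of 10,000 entries: both values are 99945.
        System.out.println(countComparisons(10000, 10) + " " + formula(10000, 10));
    }
}
```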
Can we do better? Recall that when we found the smallest element in a sequence, we began with a guess of the smallest and then refined that guess by looking at the remaining elements. We can do the same thing to find the five smallest elements. Initially, we'll assume that the first five elements are the five smallest elements. Sort the first five entries in the sequence by any method. Then consider the sixth entry. Compare it to the fifth entry in the sequence, which is now the largest of the first five entries. If the sixth entry is larger, pass over it because it is not one of the five smallest entries. If, however, the sixth entry is smaller than the fifth, compare it with the fourth, third, and so on, inserting it among the first five entries so that the first five entries in the list remain the smallest entries found so far in increasing order. Repeat this process for the entries in positions 7, 8, ..., 50.
Note that this process creates a partially sorted list by inserting the smallest entries into the beginning of the list. Thus, we call this approach the partial insertion sort. To see why the partial insertion sort is superior to the partial selection sort, let us compare the two approaches when searching for the 10 smallest entries within a list of 1,000 entries. We suppose that our partial insertion sort has reached the halfway point. The 10 smallest items in the first 500 have been found and we are about to consider the entry in position 501. If the original list was randomly scrambled, it is unlikely that this entry will be less than the tenth entry, and only one comparison is required to discover this. The same is true for all the entries in positions 501 through 1000. However, if one of these entries does belong among the top 10, then this will be discovered with one comparison, and at most nine more comparisons will be required to position it properly. This is much more efficient than our partial selection sort, in which each of the last 500 entries is involved in 10 comparisons.
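The partial insertion sort described above might be sketched as follows. This version works on a plain int array rather than on the lab's sequence class, and the names PartialInsertionDemo and kSmallestByInsertion are invented for the example. Treat it as one possible reading of the description, not as the code in the lab files.

```java
// A sketch of the partial insertion sort on a plain array.
class PartialInsertionDemo {
    // Move the k smallest values of a to positions 0 .. k-1, in
    // increasing order. Assumes 1 <= k <= a.length.
    static void kSmallestByInsertion(int[] a, int k) {
        // Sort the first k entries with an ordinary insertion sort.
        for (int i = 1; i < k; ++i) {
            int value = a[i];
            int j = i - 1;
            while (j >= 0 && a[j] > value) {
                a[j + 1] = a[j];
                --j;
            }
            a[j + 1] = value;
        }
        // Consider each remaining entry in turn.
        for (int i = k; i < a.length; ++i) {
            // A single comparison rejects most entries.
            if (a[i] < a[k - 1]) {
                // Trade places with the largest of the current k
                // smallest, then slide the new value left into place.
                int value = a[i];
                a[i] = a[k - 1];
                int j = k - 2;
                while (j >= 0 && a[j] > value) {
                    a[j + 1] = a[j];
                    --j;
                }
                a[j + 1] = value;
            }
        }
    }

    public static void main(String[] args) {
        int[] a = { 9, 4, 7, 1, 8, 2, 6, 3, 5, 0 };
        kSmallestByInsertion(a, 3);
        System.out.println(java.util.Arrays.toString(a));
    }
}
```

After the call, the first three positions hold the three smallest values in increasing order. The remaining values are all still present, although their order may have changed.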
In experiment L3.2, you will investigate these two algorithms.
For our simple experiments, it is useful to be able to generate ``random''
sequences of numbers. What do we mean by ``random''? Typically, that
each sequence of a particular size is equally likely or that at each point
in the sequence, each number is equally likely as the next element. How can we generate
such sequences? Fortunately, Java provides a standard utility class,
java.util.Random. This class includes a nextInt
method that gives the next ``random'' number in the sequence. In truth, this
number is not random, in that it is generated by an algorithm. However, it is
close enough to random for our purposes.
Hence, to fill the array elements with a random sequence of 100
integers, we might write
import java.util.Random;
...
  int i;
  Random generator = new Random();
  elements = new int[100];
  for (i = 0; i < 100; ++i) {
    elements[i] = generator.nextInt();
  } // for
However, when we're comparing two algorithms, it is helpful to have the same input to both algorithms. Fortunately, Java's random number generator can take a seed that uniquely determines the random sequence. You can think of a seed as being a number for the sequence. If you use the same seed, you end up with the same sequence. For example, to get the ``first'' sequence, you would write
Random generator = new Random(1);
Note that random sequences are not always the best test cases for your algorithms. For example, when testing a sorting algorithm, you should also test sequences of varying lengths, sequences which contain all the same value, presorted sequences, and ``backwards'' sorted sequences (in which the numbers are organized largest to smallest). Nonetheless, random sequences still serve many purposes, and are often a good starting point.
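The following sketch illustrates both ideas: that a seed fixes the random sequence, and that some useful test inputs are not random at all. It uses only java.util.Random and java.util.Arrays. The class name SeedDemo and its helper methods are invented for this example and are not part of the lab files.

```java
import java.util.Arrays;
import java.util.Random;

// Build repeatable random sequences and a ``backwards'' sequence.
class SeedDemo {
    // Fill a new array of the given size from a generator that was
    // created with the given seed.
    static int[] randomSequence(long seed, int size) {
        Random generator = new Random(seed);
        int[] values = new int[size];
        for (int i = 0; i < size; ++i) {
            values[i] = generator.nextInt();
        }
        return values;
    }

    // Build the decreasing sequence size, size-1, ..., 2, 1.
    static int[] backwardsSequence(int size) {
        int[] values = new int[size];
        for (int i = 0; i < size; ++i) {
            values[i] = size - i;
        }
        return values;
    }

    public static void main(String[] args) {
        int[] first = randomSequence(1, 5);
        int[] second = randomSequence(1, 5);
        // The same seed gives the same sequence, so this prints "true".
        System.out.println(Arrays.equals(first, second));
        System.out.println(Arrays.toString(backwardsSequence(5)));
    }
}
```

Because each call creates a fresh generator from the seed, two algorithms can be handed identical inputs simply by calling randomSequence twice with the same arguments.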
In experiment L3.3 you will investigate random number generators.
By requesting the partial selection sort to find the n smallest entries in a list of n entries, we obtain an algorithm, known as the selection sort, for sorting an entire list. This algorithm first finds the smallest of the n entries of the list, requiring n - 1 comparisons, and places that entry at the top of the list. Next, the algorithm finds the smallest entry among the remaining n - 1 entries, requiring n - 2 comparisons, and moves it to the second position in the list. This process repeats until all the entries are in order. The entire process requires
(n - 1) + (n - 2) + ... + 2 + 1 = (n - 1)(n/2)
or
(1/2)(n^2 - n)
comparisons between list entries when sorting a list of length n. In big-O notation, the running time is O(n^2).
A similar analysis shows that insertion sort requires an average of
(1/4)(n^2 - n)
comparisons to sort a list of n entries. Again, in big-O notation, the running time is O(n^2).
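To see how the average-case estimate for insertion sort compares with an actual run, you might count comparisons on one random list. The sketch below is a plain-array insertion sort written for this purpose. The names InsertionCountDemo and sortAndCount are invented here, and the insertionSort in the lab files may be organized differently. A single run should land near the estimate rather than exactly on it.

```java
import java.util.Random;

// Insertion sort on a plain array, counting comparisons between entries.
class InsertionCountDemo {
    // Sort a in place and return the number of comparisons made.
    static long sortAndCount(int[] a) {
        long comparisons = 0;
        for (int i = 1; i < a.length; ++i) {
            int value = a[i];
            int j = i - 1;
            // Slide larger entries right until value's place is found.
            while (j >= 0) {
                ++comparisons;
                if (a[j] <= value) {
                    break;
                }
                a[j + 1] = a[j];
                --j;
            }
            a[j + 1] = value;
        }
        return comparisons;
    }

    public static void main(String[] args) {
        int n = 1000;
        Random generator = new Random(1);
        int[] a = new int[n];
        for (int i = 0; i < n; ++i) {
            a[i] = generator.nextInt();
        }
        System.out.println("Counted:  " + sortAndCount(a));
        System.out.println("Estimate: " + ((long) n * n - n) / 4);
    }
}
```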
In experiment L3.4, you will consider insertion sort. In experiment L3.5, you will consider selection sort. In experiment L3.6, you will compare the two.
In the 1960's C. A. R. Hoare, a pioneer in the field of computer science, discovered the Quicksort algorithm. In the average case, the number of comparisons performed by this algorithm when sorting a list of n entries is O(n*lg(n)). However, in the worst case, Quicksort is also O(n^2).
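For readers who have not seen Quicksort, here is a sketch of one simple variant that also counts comparisons, so that a run on a random list can be set against n*lg(n) and n^2. It assumes the first element of each sublist is used as the pivot. That choice, along with the names QuicksortCountDemo and quicksort, is an assumption made for this example, and the quickSort in the lab files may partition differently.

```java
import java.util.Random;

// A simple Quicksort on a plain array that reports its comparisons.
class QuicksortCountDemo {
    // Sort a[lb .. ub] in place, using a[lb] as the pivot, and return
    // the number of comparisons between entries.
    static long quicksort(int[] a, int lb, int ub) {
        if (lb >= ub) {
            return 0;
        }
        long comparisons = 0;
        int pivot = a[lb];
        // Invariant: a[lb+1 .. boundary] holds the values seen so far
        // that are smaller than the pivot.
        int boundary = lb;
        for (int i = lb + 1; i <= ub; ++i) {
            ++comparisons;
            if (a[i] < pivot) {
                ++boundary;
                int temp = a[boundary];
                a[boundary] = a[i];
                a[i] = temp;
            }
        }
        // Put the pivot between the smaller and the larger values.
        a[lb] = a[boundary];
        a[boundary] = pivot;
        // Sort the two parts on either side of the pivot.
        comparisons += quicksort(a, lb, boundary - 1);
        comparisons += quicksort(a, boundary + 1, ub);
        return comparisons;
    }

    public static void main(String[] args) {
        int n = 1000;
        Random generator = new Random(1);
        int[] a = new int[n];
        for (int i = 0; i < n; ++i) {
            a[i] = generator.nextInt();
        }
        System.out.println("Counted: " + quicksort(a, 0, n - 1));
        System.out.println("n*lg(n): " + Math.round(n * Math.log(n) / Math.log(2)));
        System.out.println("n^2:     " + (long) n * n);
    }
}
```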
You will investigate the running time of Quicksort in experiment L3.7.
Name: ________________
ID:_______________
Required files:
Step 1.
Make copies of Counter.java, SortableIntSeq.java,
and SortTester.java. Compile all three and execute
SortTester. Find the smallest element in a list of size 50.
Describe what SortTester does (or can do).
Step 2.
One problem with SortTester and SortableIntSeq is
that they do not provide an easy way to count the steps in an algorithm. How
should we do that? Preferably with a Counter object. Read the
code for that class and explain what it does.
Step 3.
Build a new version of the smallest method from SortableIntSeq
that takes a Counter as a parameter and uses that counter to count the
steps it executes. Recompile SortableIntSeq and correct any errors.
Summarize your changes.
Step 4.
Extend SortTester so that it uses a Counter to count
the steps in SortableIntSeq's smallest method.
Recompile SortTester and correct any errors.
Summarize your changes.
Step 5.
Execute SortTester and record the number of steps required to
find the smallest element in lists of size 10, 20, 100, and 1000.
10: 20: 100: 1000:
After recording your results, you may want to look at our notes on this step.
Required files:
Step 1.
Make copies of Counter.java, SortableIntSeq.java,
and SortTester.java. Compile all three and execute
SortTester. Find the five smallest elements in a list of size 50.
Record the results.
Step 2.
Update SortableIntSeq so that fiveSmallest,
newFiveSmallest, and any methods they use take
Counters as parameters and count their steps. Update
SortTester to call those methods with a Counter
and print out the number of steps executed. Recompile both files and
correct any errors. Summarize your changes.
Step 3.
Use your modified SortTester to fill in the following table.
Steps to find the smallest five elements in a list of size n, using
naive partial selection sort and the better partial insertion sort.
n steps steps
(naive) (improved)
500
1000
2000
Step 4.
Update SortableIntSeq and SortTester to look for the
seven smallest elements, rather than the five smallest elements. Fill in the table.
Steps to find the smallest seven elements in a list of size n, using
naive partial selection sort and the better partial insertion sort.
n steps steps
(naive) (improved)
500
1000
2000
Step 5.
Add kSmallest and newKSmallest methods to
SortableIntSeq. These will behave like fiveSmallest
and newFiveSmallest, except that they
take k (the number of small elements to find) as a parameter.
Recompile SortableIntSeq and correct any errors. Summarize your changes.
Step 6.
Update SortTester so that it reads in the number of elements to
find (in the cases in which we want k small elements). Recompile
SortTester and correct any errors. Summarize your changes.
Step 7.
Using your augmented SortTester, record the number of steps
for each of the following
Steps to find the smallest k elements in a list of size n, using
naive partial selection sort and the better partial insertion sort.
k n steps steps
(naive) (improved)
5 500
10 1000
15 2000
Step 8.
Repeat step 7 for the following table. In these cases you are selecting the top 10 percent of the list, while in step 7 you selected roughly the top 1 percent.
Steps to find the smallest k elements in a list of size n, using
naive partial selection sort and the better partial insertion sort.
k n steps steps
(naive) (improved)
25 250
50 500
100 1000
200 2000
Step 9.
Do those numbers match those theorized in the discussion? Why or why not?
Step 10. What do you conclude about the advantages of one partial sort over the other?
Required files:
Step 1.
Make copies of SortableIntSeq.java and SortTester.java.
Compile the two files. Using SortTester, make five lists of ten
random numbers. Record those lists.
Step 2.
Update SortTester to take a seed as an input. Pass
that seed to the appropriate method of SortableIntSeq so that the
random sequence is generated from it. Recompile the files and correct any errors. Using
SortTester, make three lists of ten random numbers, using the
same seed each time (do not use zero as your seed). Record your results.
Step 3.
At times, you will want to use presorted sequences instead of random sequences.
Read the code for SortableIntSeq and determine which methods can
be used to generate presorted sequences. What command might you use to
create the sequence [1,3,5,7,9,...,301]? What command might you use to
create the sequence [301,299,297,...,5,3,1]?
Required files:
Step 1.
Make copies of Counter.java, SortableIntSeq.java,
and SortTester.java. Compile all three and execute
SortTester. Using insertionSort, sort a list
of ten numbers. Did it work correctly? Record the original list and the
sorted list.
Step 2.
Update SortTester and SortableIntSeq to count
the number of steps in insertion sort. Recompile the files and summarize
your changes.
Step 3. Use insertion sort to sort ten randomly generated lists of 100 elements. Record the number of steps in each case.
Step 4. Is the number of steps always the same? Why or why not? After answering this question you may want to read our notes on this step.
Step 5. Use insertion sort to sort the lists
Record the number of steps in each case.
Step 6. Is the number of steps always the same? Why or why not? After answering this question you may want to read our notes on this step.
Step 7. Use insertion sort to sort the lists
Record the number of steps in each case.
Step 8. Is the number of steps always the same? Why or why not? After answering this question you may want to read our notes on this step.
Step 9. Reflecting on your experiments, which types of lists is insertion sort best at sorting? Worst at sorting? Why? Is its running time on random lists closer to the best time or worst?
Required files:
Step 1.
Make copies of Counter.java, SortableIntSeq.java,
and SortTester.java. Compile all three and execute
SortTester. Using selectionSort, sort a list
of ten numbers. Did it work correctly? Record the original list and the
sorted list.
Step 2.
If you read the code in SortableIntSeq, you will see that
selectionSort is not defined. Fill in the body appropriately,
recompile SortTester, test the new selectionSort,
and correct any errors. Enter the definition of
selectionSort here. Note that you may want to look at the
definition of fiveSmallest as you define selectionSort.
Step 3.
Update SortTester and SortableIntSeq to count
the number of steps in selection sort. Recompile the files, correct any
errors, and summarize
your changes.
Step 4. Use selection sort to sort ten randomly generated lists of 100 elements. Record the number of steps in each case.
Step 5. Use selection sort to sort the lists
Record the number of steps in each case.
Step 6. Use selection sort to sort the lists
Record the number of steps in each case.
Step 7. Is the number of steps always the same? Why or why not?
Step 8. Reflecting on your experiments, which types of lists is selection sort best at sorting? Worst at sorting? Why? Is its running time on random lists closer to the best time or worst?
Required files:
Step 1.
Using the modified versions of SortableIntSeq and
SortTester, fill in the following table. For each length
sequence, try three different random sequences. Make sure that the two
sorting mechanisms are run on the same random sequences.
Running time of insertion sort and selection sort on different length
random sequences, with three tests per sequence length.
Sequence Steps Steps
length (insertion sort) (selection sort)
Test1 Test2 Test3 Test1 Test2 Test3
100
200
400
800
2000
Step 2.
Using the modified SortTester and SortableIntSeq,
fill in the following table, using sequences of the form [1,2,3,...,n].
Running time of insertion sort and selection sort on different length
increasing sequences, with one test per sequence length.
Sequence Steps Steps
length (insertion sort) (selection sort)
100
200
400
800
2000
Step 3.
Using the modified SortTester and SortableIntSeq,
fill in the following table, using sequences of the form [n,n-1,n-2,...,3,2,1].
Running time of insertion sort and selection sort on different length
decreasing sequences, with one test per sequence length.
Sequence Steps Steps
length (insertion sort) (selection sort)
100
200
400
800
2000
Step 4. What do you observe from the tables above? Explain your findings.
Required files:
Step 1.
Make copies of Counter.java, SortableIntSeq.java,
and SortTester.java. Compile all three and execute
SortTester. Using quickSort, sort a list
of ten numbers. Did it work correctly? Record the original list and the
sorted list.
Step 2. Augment the classes to count the number of steps in Quicksort. Recompile the files and correct any errors. Summarize your changes.
Step 3. Run insertion sort and Quicksort on a few lists of different sizes, recording the number of steps.
Running time of insertion sort and Quicksort on different length
random sequences, with three tests per sequence length.
Sequence Steps Steps
length (insertion sort) (Quicksort)
Test1 Test2 Test3 Test1 Test2 Test3
100
200
400
800
2000
Step 4. Plot the results from the previous table.
Step 5. Summarize your findings.
Required files:
Step 1.
Using the modified versions of SortableIntSeq and
SortTester, fill in the following table. For each length
sequence, try three different random sequences, one increasing
sequence, and one decreasing sequence.
Running time of Quicksort on different types and lengths of sequences.
Sequence Steps
length Rand1 Rand2 Rand3 Inc. Decr.
100
200
400
800
2000
Step 2. What do these results suggest?
a. Develop a Person class in which each object contains
information about a person, including last name, telephone number,
city, state, and zip.
b. Write a program that sorts sequences of Person objects.
You may want to use the compareTo(String other)
method from the String class, which returns a negative number
if the current string is less than the other string.
c. Allow the user to select a "tie-breaking" field by which to distinguish between records that have the same value. For example, you might wish to have last-name ties sorted by first name within each group. Incorporate this tie-breaking method into your program.
d. Note any efficiency issues that arise while implementing these various sorting routines.
a. Develop a PlayingCard class.
b. Develop a Deck class, for decks of playing cards.
c. Create a shuffle method that shuffles a deck of playing cards.
You might shuffle a deck by randomly selecting cards to swap, and doing that some appropriate number of times. Remember that you can use absolute value and the modulus operator to translate a number to a particular range.
You might also shuffle a deck by assigning a random number to each card and then sorting by those numbers.
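The hint about absolute value and the modulus operator can be sketched as follows. The order of the two operations matters: taking the remainder first keeps the value strictly between -n and n, whereas Math.abs applied directly to Integer.MIN_VALUE returns a negative number. The class name RangeDemo and the method toRange are invented for this example.

```java
import java.util.Random;

// Translate arbitrary ints into the range 0 .. n-1.
class RangeDemo {
    // Take the remainder first, then the absolute value, so that the
    // argument to Math.abs is never Integer.MIN_VALUE. Assumes n > 0.
    static int toRange(int value, int n) {
        return Math.abs(value % n);
    }

    public static void main(String[] args) {
        Random generator = new Random(1);
        // Pick five positions in a 52-card deck.
        for (int i = 0; i < 5; ++i) {
            System.out.println(toRange(generator.nextInt(), 52));
        }
    }
}
```

java.util.Random also offers nextInt(int bound), which returns a value from 0 up to but not including bound, so you may prefer that if your assignment does not require the modulus approach.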
a. Add a sort method to the Deck class
that sorts the cards
in a deck into ascending order. Design your method to report the number of
comparisons performed during the sorting process.
b. Using the methods shuffle and sort, write a
program that reports statistics
(such as the number of comparisons per sort) over numerous shuffles of the deck.
c. How difficult would it be to change your sorting algorithm to sort, say, in descending order, by a different suit arrangement, or into groups of similarly valued cards?
Write a program implementing the sieve of Eratosthenes for finding the prime numbers between 1 and n. Apply your solution to various values of n. How does the time required by the program increase as n increases? Explain your findings.
Experiment L3.1, Step 5.
If you only count the number of times the body in the loop is executed, it
is likely that the number of steps in smallest is one less than
the number of elements in the sequence.
Experiment L3.4, Step 4. Since the insertions we have to do may differ from sequence to sequence, it is likely that the running times will be different.
Experiment L3.4, Step 6. While the lists are different, they are ordered the same. This means that the number of swaps should be the same.
Experiment L3.4, Step 8. While the lists are different, they are ordered the same. This means that the number of swaps should be the same.
Disclaimer Often, these pages were created "on the fly" with little, if any, proofreading. Any or all of the information on the pages may be incorrect. Please contact me if you notice errors.
This page may be found at http://www.math.grin.edu/~rebelsky/Courses/CS152/99S/Labs/sorting.html
Source text last modified Tue Mar 2 09:25:21 1999.
This page generated on Tue Mar 2 11:18:01 1999 by SiteWeaver.
Contact our webmaster at rebelsky@math.grin.edu