Sorting a collection of records usually involves inspecting one field of each record, called the key, and arranging the records so that their keys are in ascending order. The keys may be natural numbers -- serial numbers, perhaps -- or they may be short strings of digits or other characters, such as the letters of a person's surname or the digits of a ZIP code.
If a key is a short array of characters or other values, or if, like a serial number, it can easily be converted into such an array, it is often possible to use a sorting method known as radix sorting. In radix sorting, one sets up a queue for each possible value of a component of the key. (For instance, if sorting by ZIP codes, one would set up ten queues, one for each digit from 0 to 9.) Then one distributes the records into these component queues by examining the last or least significant component of their keys (so that, for instance, all of the records with ZIP codes ending in 0 would be placed in the 0 queue, all those with ZIP codes ending in 1 in the 1 queue, and so on). Next, one reconstructs the full collection by taking all of the elements in the 0 queue, then all of the elements in the 1 queue, and so on in order; the result is a master queue in which the records are sorted by their last digit.
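The conversion of a serial number into an array of components, mentioned at the start of the preceding paragraph, might look something like the following sketch. It is written in standard Pascal rather than the HP dialect, and the procedure name number_to_key, the five-digit key width, and the sample value are assumptions made for the illustration.

```pascal
program number_to_key_demo;

const
  key_size = 5;  { assumed width: five-digit ZIP codes }

type
  component = '0' .. '9';
  key_type = array [1 .. key_size] of component;

var
  key: key_type;
  i: integer;

{ Hypothetical helper: fill key with the decimal digits of n, most
  significant digit first, padding on the left with zeros.  Assumes
  that n is non-negative, has at most key_size digits, and fits in the
  compiler's integer type. }
procedure number_to_key (n: integer; var key: key_type);
var
  position: integer;
begin
  for position := key_size downto 1 do begin
    key[position] := chr (ord ('0') + n mod 10);
    n := n div 10
  end
end;

begin
  number_to_key (2134, key);   { sample value; small enough for 16 bits }
  for i := 1 to key_size do
    write (key[i]);
  writeln                      { prints 02134 }
end.
```

Padding with leading zeros matters: every key must have the same number of components, so that position 1 always holds the most significant one.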
The next step is to redistribute the records into the component queues, this time according to the next-to-last component of the key, and to reconstruct the master queue from the component queues in the same way as before. Since the distribution process is stable, in the sense that it never changes the relative order of two records whose keys have the same component in the position being examined, records that agree in the next-to-last digit remain ordered by their last digit, and so the resulting master queue is correctly sorted by the last two digits of the key.
By repeating the distribution and reconstruction steps for each component of the key, from least significant to most significant, one eventually obtains a completely sorted master queue. (If one is sorting by five-digit ZIP codes, for instance, five cycles of distribution and reconstruction are needed.)
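To watch the cycles of distribution and reconstruction on a small input, one can run a sketch such as the following. It is not the queue-based method itself: to stay short, it uses arrays and one scan of the list per digit value in place of queues, which is slower but distributes the keys in the same stable way. The three-digit keys and all of the names are assumptions chosen for the illustration, and the code is standard Pascal rather than the HP dialect.

```pascal
program radix_trace;

const
  key_size = 3;  { digits per key (assumed for this example) }
  n = 6;         { number of keys (assumed for this example) }

type
  key_type = packed array [1 .. key_size] of char;
  key_list = array [1 .. n] of key_type;

var
  master, next: key_list;
  position, i, filled: integer;
  digit: char;

{ Print the list as it stands after the given pass. }
procedure show (pass: integer; var list: key_list);
var
  i, j: integer;
begin
  write ('after pass ', pass : 1, ':');
  for i := 1 to n do begin
    write (' ');
    for j := 1 to key_size do
      write (list[i][j])
  end;
  writeln
end;

begin
  master[1] := '329';  master[2] := '457';  master[3] := '657';
  master[4] := '839';  master[5] := '436';  master[6] := '720';

  { One cycle per component, least significant first. }
  for position := key_size downto 1 do begin
    filled := 0;
    { Taking the digit values in ascending order, and the keys in their
      current order within each value, keeps the distribution stable. }
    for digit := '0' to '9' do
      for i := 1 to n do
        if master[i][position] = digit then begin
          filled := filled + 1;
          next[filled] := master[i]
        end;
    master := next;
    show (key_size - position + 1, master)
  end
end.
```

With these six keys the program should print

    after pass 1: 720 436 457 657 329 839
    after pass 2: 720 329 436 839 457 657
    after pass 3: 329 436 457 657 720 839

Note that 457 stays ahead of 657 from the first pass onward, because the two keys agree in their last two digits and the distribution never reorders records whose current components are equal.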
Here is an HP Pascal procedure that implements this algorithm:
const
  key_size = { the number of components in a key };
  least = { the least possible value for one component of a key };
  greatest = { the greatest possible value for one component of a key };

type
  component = least .. greatest;
  key_type = array [1 .. key_size] of component;
  rec = record
    key: key_type;
    { presumably other fields as well }
  end;

procedure radix_sort (var master: queue);
var
  val: component;
    { runs through the possible values of a component of the key }
  small_queue: array [component] of queue;
    { a queue for each of those possible values }
  position: 1 .. key_size;
    { runs through the positions of the components within a key }
  item: rec;
    { one item at a time from the master queue }
begin

  { Set up the component queues. }
  for val := least to greatest do
    small_queue[val] := create_queue;

  { Run through a cycle of distribution and reconstruction for each
    component of the key. }
  for position := key_size downto 1 do begin

    { Distribute items from the master queue into the component queues. }
    while not is_empty_queue (master) do begin
      item := dequeue (master);
      enqueue (item, small_queue[item.key[position]])
    end;

    { Reconstruct the master queue. }
    for val := least to greatest do
      while not is_empty_queue (small_queue[val]) do
        enqueue (dequeue (small_queue[val]), master)
  end;

  { Recycle the (empty) component queues. }
  for val := least to greatest do
    deallocate_queue(small_queue[val]);
end;

This implementation presupposes the existence of the five basic queue
operations create_queue, is_empty_queue,
dequeue, enqueue, and
deallocate_queue. Here is a module that provides them,
implementing them in terms of singly-linked lists with a header containing
pointers to the first and last components:
{ This module defines an interface for a queue data type and implements it
for HP 9000 Series 700 workstations under HP-UX 9.x, using HP Pascal.
Programmer: John Stone, Grinnell College.
Original version: April 18, 1996.
}
{ The dispose() procedure does not actually recycle storage unless the
HEAP_DISPOSE compiler option is turned on. }
$heap_dispose on$
module queues;

export

  const
    key_size = 9;  { for U. S. Social Security numbers, say }
    least = '0';
    greatest = '9';

  type
    component = least .. greatest;
    key_type = array [1 .. key_size] of component;
    rec = record
      key: key_type;
      { presumably other fields as well }
    end;
    element = rec;
    queue = ^queue_header;

  { The create_queue function constructs and returns an empty queue capable
    of holding any number of elements. }
  function create_queue: queue;

  { The is_empty_queue function determines whether a given queue is
    empty. }
  function is_empty_queue (q: queue): Boolean;

  { The dequeue function extracts the oldest element from a non-empty queue
    and returns it.  It is an error to give an empty queue as the argument
    to dequeue. }
  function dequeue (var q: queue): element;

  { The enqueue procedure adds an element at the end of an existing
    queue. }
  procedure enqueue (item: element; var q: queue);

  { The deallocate_queue procedure recycles all the storage associated with
    a given queue, leaving its argument undefined. }
  procedure deallocate_queue (var q: queue);

implement

  import
    stderr;

  const
    { The following constants are more or less arbitrary integers
      signifying various kinds of exceptions that can occur within this
      module. }
    FIRST_EXCEPTION_CODE = 1;
    DEQUEUE_EXCEPTION = 1;
    EXCEPTION_EXCEPTION = 2;
    LAST_EXCEPTION_CODE = EXCEPTION_EXCEPTION;

  type
    link = ^queue_component;
    queue_component = record
      datum: element;
      next: link;
    end;
    queue_header = record
      front, rear: link
    end;

  { The queue_handler procedure, which is not exported, is invoked
    whenever one of the preconditions for the successful execution of a
    procedure is found to be false.  It prints out an appropriate
    explanation of the exception just before the program is halted. }
  procedure queue_handler (exception_code: integer);
  begin
    if (exception_code < FIRST_EXCEPTION_CODE) or
       (LAST_EXCEPTION_CODE < exception_code) then
      exception_code := EXCEPTION_EXCEPTION;
    write (stderr, 'Exception #', exception_code : 1,
           ' in module QUEUES: ');
    case exception_code of
      DEQUEUE_EXCEPTION:
        writeln (stderr, 'empty queue as argument to function DEQUEUE ');
      EXCEPTION_EXCEPTION:
        writeln (stderr, 'argument out of range in procedure ',
                 'QUEUE_HANDLER.');
    end
  end;

  function create_queue: queue;
  var
    result: queue;
      { the queue that is constructed }
  begin
    new (result);
    result^.front := NIL;
    result^.rear := NIL;
    create_queue := result
  end;

  function is_empty_queue (q: queue): Boolean;
  begin
    is_empty_queue := (q^.front = NIL)
  end;

  function dequeue (var q: queue): element;
  var
    old_link: link;
      { a pointer to the component to be removed from the queue }
  begin
    assert (q^.front <> NIL, DEQUEUE_EXCEPTION, queue_handler);
    dequeue := q^.front^.datum;
    old_link := q^.front;
    q^.front := old_link^.next;
    if q^.front = NIL then
      q^.rear := NIL;
    dispose (old_link)
  end;

  procedure enqueue (item: element; var q: queue);
  var
    new_link: link;
      { a pointer to the component to be added to the queue }
  begin
    new (new_link);
    new_link^.datum := item;
    new_link^.next := NIL;
    if q^.rear = NIL then
      q^.front := new_link
    else
      q^.rear^.next := new_link;
    q^.rear := new_link
  end;

  procedure deallocate_queue (var q: queue);
  var
    traverser: link;
      { a pointer to successive components of the underlying linked list }
    trailer: link;
      { a similar pointer, lagging one component behind traverser }
  begin
    traverser := q^.front;
    while traverser <> NIL do begin
      trailer := traverser;
      traverser := traverser^.next;
      dispose (trailer)
    end;
    dispose (q);
    q := NIL
  end;

end.

A faster implementation of the radix sort can be obtained by
manipulating the link pointers directly, since every transfer through
dequeue and enqueue disposes of one list component, allocates another,
and copies the whole record. For instance, instead
of using dequeue and enqueue to transfer records
from the component queues into the master queue, one could rebuild it by
linking the last item in each component queue to the first item in the
next. However, the handling of the special case that arises when some of
the component queues are empty obscures the working of the radix-sorting
algorithm, so the slower but simpler version is presented here.
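For readers who want to try the pointer-linking idea, here is one possible sketch of the splicing step, including the test for an empty queue. The procedure append_queue is hypothetical and is not part of the module above; the program repeats simplified versions of the queue declarations (with integer elements standing in for records) so that it can be compiled on its own as standard Pascal. Storage is never recycled in this short demonstration.

```pascal
program splice_demo;

type
  element = integer;  { stand-in for the record type; an assumption }
  link = ^queue_component;
  queue_component = record
    datum: element;
    next: link
  end;
  queue_header = record
    front, rear: link
  end;
  queue = ^queue_header;

var
  a, b, c, master: queue;
  p: link;

function create_queue: queue;
var
  q: queue;
begin
  new (q);
  q^.front := nil;
  q^.rear := nil;
  create_queue := q
end;

procedure enqueue (item: element; var q: queue);
var
  new_link: link;
begin
  new (new_link);
  new_link^.datum := item;
  new_link^.next := nil;
  if q^.rear = nil then
    q^.front := new_link
  else
    q^.rear^.next := new_link;
  q^.rear := new_link
end;

{ Hypothetical helper: move every component of source to the end of
  target by relinking, without copying or allocating, and leave source
  empty.  The time taken does not depend on the length of either queue. }
procedure append_queue (var source, target: queue);
begin
  if source^.front <> nil then begin  { an empty source adds nothing }
    if target^.rear = nil then
      target^.front := source^.front
    else
      target^.rear^.next := source^.front;
    target^.rear := source^.rear;
    source^.front := nil;
    source^.rear := nil
  end
end;

begin
  a := create_queue;  b := create_queue;  c := create_queue;
  master := create_queue;
  enqueue (1, a);  enqueue (2, a);  { b is left empty on purpose }
  enqueue (3, c);
  append_queue (a, master);
  append_queue (b, master);
  append_queue (c, master);
  p := master^.front;
  while p <> nil do begin           { writes the elements 1, 2, 3 in order }
    write (p^.datum : 2);
    p := p^.next
  end;
  writeln
end.
```

In the radix sort, a call such as append_queue (small_queue[val], master) would take the place of the inner while loop of the reconstruction step; because it reaches into the queue header, such a procedure would have to live inside the queue module rather than in the sorting procedure.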
This document is available on the World Wide Web as
http://www.math.grin.edu/~stone/courses/fundamentals/radix-sorting.html