Algorithms and Computer Programming

Summary:
We discuss the general properties of algorithms and how they are expressed in computer programming languages.

Algorithms

Recall that an algorithm is an orrdered sequence of instructions for solving a problem. There are certain elements that often arise in a wide variety of algorithms; I like to think of these as the fundamental building blocks for creating algorithms. We'll cover five basic building blocks: variables, subroutines, parameters, conditionals, and repetition.

Variables: Named values

We like to name things. You have a name. Perhaps you've named a dog or other family pet. We give things names because it often helps us specify who or what we want to refer to in a clear, compact manner. Whereas our tendency in natural language is to refer to things ambiguously with pronouns like "it" or "they", variables give us the ability to refer to values unambiguously. Another way to think of a variable is like a named box that stores a value. For instance, think of π (pi). This is a name that represents a well-known number, 3.14159. Despite being called a variable, the value of π doesn't vary. The table below demonstrates two variable names and values those names might refer to.

Variable name Value stored
weekday "Monday"
day 23
pi
3.14159

Conditionals: Handling different situations

Sometimes we like our algorithms to do different things depending on whether certain things are true or not. For instance, an algorithm for a robot might move forward if the path is free; otherwise it might turn slightly. Conditionals often have the form:
if something is true:  then do something
As in our robot example above, we might have a richer construction that gives an alternative behavior:

if something is true:  then do something  else: do something else
Sometimes there are more than two possibilities. In that case, there will often be a longer sequence of tests and actions.

Repetition

Sometimes we need our algorithm to do things again and again. For example, when a cell phone receives a call, its software might be programmed to play the ring tone several times (maybe even until the user picks up). It can be tedious or even impossible to explicitly write out repeated commands multiple times, so computer scientists have devised ways to express repeated actions more compactly. Repetition often takes one of the following forms:
  1. Repeat action(s) until a desired condition is met.
  2. While some condition is met, repeat action(s) .
  3. Repeat action(s) a fixed number of times.
In the first case, an algorithm for a cake recipe might dictate to "stir until all ingredients are moistened." As an example of the second case, while you are hungry you might continue eating food at the dining hall. You might have noticed these first two cases seem very similar. The major difference is in when you test the condition, after doing the action the first time (as in the first case, or before doing the action the first time (as in the second case). The last is the most straightforward: "Fold three times."

Subroutines: Named helper algorithms

Many algorithms require the use of some common, or more basic, operations. In fact, most algorithms we write refer to some other algorithms. For instance, in an algorithm that keeps track of how many times a repeated action has occured, we may want to increment, or add one to some counter variable, as in
counter = 1 + counter
This is not a mathematical equation. Rather, the symbol = means "is assigned the value." Recall that a variable is like a named box. Well, we'd read the above expression as "the variable counter is now assigned the value of one plus whatever value the variable counter had before." Even though you know the algorithm for addition, writing it out would be a painstaking, arduous chore everytime we wanted to add numbers in a computer program. Instead, someone else has gone to that trouble and we use + as the name of a helper algorithm representing addition. Whether it's adding numbers, reading input from a keyboard, or searching for a word in a block of text, computer scientists write algorithms for common actions and use them in other algorithms. This allows us to understand algorithms more intuitively at a higher, more abstract level than if every minute detail were explicitly written out at length.

Parameters: Named inputs

For many subroutines, it is useful to provide them inputs that help guide their behavior. For instance, the two numbers two be added would be the inputs to the addition algorithm. We call these inputs parameters. Perhaps you recognize this term from mathematics. When you saw the mathematical expression f(x), you might say that x is the parameter to the function f. In many programming languages, subroutines are also thought of as functions, and the inputs, like x, are parameters that tell us where (or really, how) to evaluate the function/subroutine. The parameters then influence what value the algorithm produces. For example, another very useful subroutine called input might handle all the details of prompting a program's user for input and reading data from the keyboard. Many programming languages indicate a subroutine and its parameters the same way f(x) indicates a mathematical function f and its parameter x. For example:
start = input("Enter a number: ")
In the expression above, we again use the symbol = to assign a value to the variable named start. The procedure input takes some text as a parameter, which is used as a prompt to the program user who presumably types in a value, which is then given to the variable start.

What is a computer program?

We have now covered some of the main building blocks for writing algorithms. Because the chief purpose of an algorithm is to compute the solution to some problem, it is typically useful to express algorithms in a way that can be understood by computers.

Programming languages

Programming languages are designed by computer scientists to express algorithms. Most of these can be "understood" by computers. We need them because English (and other "natural" languages) can be ambiguous, as in Groucho Marx's famous line:
Time flies like an arrow; fruit flies like a banana.
To a programmer, it often seems that computers will misinterpret whatever you write. However, the specification of a particular programming language tells us what computers "hear" when we program them. The syntax of a language tells us its grammar rules. When it comes to programming, computers have no judgment. They cannot figure out what you want. Instead, they give error messages for seemingly small mistakes (like your writing professor who really cares about commas). Perhaps similarly, these messages can be somewhat cryptic, especially to the beginner.1

Program files

Computer programs can be thought of at several levels. Over the next few weeks, we'll be studying all of them. There are two (or three) levels one may think about a program and methods for going from higher, human intelligible versions to the lower, computer-ready versions. We'll look at the levels first and then briefly describe the "translation" process.

The "highest" level (as in, farthest from the computer) of a program is called the source code. Typically this is a text file written by a programmer in a formal language. Just like the informal algorithms we have discussed in class, the file would contain a sequence of instructions expressed more formally. Just like natural languages, there are many varieties of formal languages, each with their relative strengths. Programs in source code form are not executed directly by the computer. Instead, they must be transformed into a lower level.

The next lower level of a program is called assembly code. This form of language expresses an algorithm in the simplest steps that can be directly implemented by a computer. Assembly code is still a text file (i.e., a sequence of bits interprested as ASCII). Thus, it is still human readable but difficult to interpret because there is little abstraction. It is like comparing the instruction to "swing your arm over your head", something akin to source code in an algorithm for throwing a ball, to describing the sequence of neurons in your brain that must fire to accomplish the same task, more like assembly code. Assembly is primarily a mnemonic for machine code, which is also called object or executable code.

Machine code is largely the same sequence of instructions, but written in a binary code that the central processor of a computer can understand directly. As you might guess, these are very detailed instructions and they are generally not interpretable by humans. During World War II, programmers for the earliest computers essentially wrote their programs in machine code.

Fortunately, we now have programs that tackle the conversion of high level source code to low level assembly and machine code automatically. This translation happens in one of two different ways, compiling and interpreting.

Compiling a source code program is a complete translation to object code in advance. One writes the source code and then runs a compiler (which is just another computer program) to generate the corresponding machine code. We can then execute that machine code directly on the central processor whenever it is needed.

In contrast, we could interpret a computer program. In this framework, instructions are translated from source code to machine code just before they are run on the processor. One runs a program called the interpreter, and it translates instructions from source code to machine code which are then immediately executed. The translations are not saved. Interpreted languages are often easier to understand and quickly write programs in. Because the translations happen on the fly, they are also often used interactively. However, because the intermediate translation step is required every time the program is run, interpreted languages may not be as fast as using compiled languages.

In this class, we'll be using Python, an interpreted language, for writing some high level source code. While it is easy to learn, it is still of course a formal language and therefore picky about things.

Acknowledgments

"What is a computer program?" adapted from materials by Marge Coahran. "Algorithms" adapted from materials by Samuel A. Rebelsky. Used by permission.
Copyright 2011-2012 Jerod Weinman, Janet Davis. cc-by-nc-sa.png This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 United States License.

Footnotes:

1Because these messages are authored by programmers, it simply stresses the importance of good writing-especially writing that considers the audience and its needs.