Hacking gadflies

Herman, Dave. “Nancy typing.” The little calculist, February 23, 2006.

Summary: A cautionary tale illustrating that overly aggressive type coercion can produce exotic, hard-to-diagnose bugs:

My friend Mike MacHenry told me this story. Mike's company has an expert system, written in Perl, that employs heuristics to determine whether a customer is likely to have filled out a form incorrectly. One of the heuristics is supposed to check that the names given in two different columns of a record are the same. Now, it turns out that the comparison was implemented wrong--the software was almost always reporting that the names were the same, even when they weren't--but because the expert system is only making a best guess, this just passed unnoticed for a while as a false negative.

But every once in a while, the system reported a false positive, i.e., that the names were not the same when in fact they were. And it just so happened that in each one of these cases, the name in the record was “Nancy.”

The explanation is that the programmer had accidentally used the numeric equality operator, rather than string equality. Perl silently converted the strings to numbers in order to make the operation succeed. Since in most cases the strings were not numeric, Perl parsed the maximal numeric prefix of each string, namely the empty string. It so happens that the empty string happily converts to 0 in Perl, so most of the time, both strings in a pair convert to 0, and the equality comparison evaluates to true.
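
The mistake can be sketched in a few lines of Perl. This is a hypothetical reconstruction, not the code from Mike's system: the helper names names_match_buggy and names_match_fixed and the sample names are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical reconstruction of the bug: `==` compares numerically,
# so both operands are converted to numbers before the comparison.
sub names_match_buggy {
    my ($first, $second) = @_;
    no warnings 'numeric';    # mimic the silent coercion in the story
    return $first == $second ? 1 : 0;
}

# The intended check: `eq` compares the operands as strings.
sub names_match_fixed {
    my ($first, $second) = @_;
    return $first eq $second ? 1 : 0;
}

# Neither "Alice" nor "Bob" has a numeric prefix, so both become 0 under `==`.
print names_match_buggy("Alice", "Bob"), "\n";   # prints 1
print names_match_fixed("Alice", "Bob"), "\n";   # prints 0
```

Without the `no warnings 'numeric'` line, `use warnings` makes Perl report `Argument "Alice" isn't numeric in numeric eq (==)`, which is one way a bug like this could have been caught earlier.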

But “Nan” is parseable in Perl as not-a-number (NaN), which is always considered unequal to any number, including itself. So the numeric prefix of “Nancy” happily converts to NaN, and the name then tests false for numeric equality with itself.
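
The NaN behavior can be checked directly. The sketch below assumes IEEE 754 floating point. The helper numifies_to_nan is a hypothetical name introduced here. Whether a given string such as "Nancy" numifies to NaN can depend on the Perl version and platform, so the code tests for it rather than assuming a result.

```perl
use strict;
use warnings;

# Build a NaN without relying on string parsing: infinity divided by infinity.
my $inf = 9**9**9;
my $nan = $inf / $inf;

# NaN compares unequal to everything under `==`, including itself.
print $nan == $nan ? "equal\n" : "not equal\n";   # prints "not equal"

# Hypothetical helper: report whether a string converts to NaN in numeric context.
sub numifies_to_nan {
    my ($text) = @_;
    no warnings 'numeric';               # the conversion is deliberate here
    my $number = $text + 0;
    return $number != $number ? 1 : 0;   # only NaN is unequal to itself
}

# Output for "Nancy" may vary by Perl build, so no result is asserted here.
for my $name ("Nancy", "Alice") {
    printf "%-6s numifies to NaN: %s\n", $name, numifies_to_nan($name) ? "yes" : "no";
}
```

On a Perl that reports "yes" for "Nancy", the comparison "Nancy" == "Nancy" evaluates to false, which is the false positive in the story.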