Traditions in Computing

2009-10-13 11:33

This essay is a reaction to the essay Programmer Superstitions by Jesse Liberty (from 2006).

Technological Traditions

From the outset, our usage of computers and information systems is wholly formed by the designers of those systems. What is, in fact, somewhat remarkable is the unity with which those designs are created, across oceans, cultures, and decades. Some are "obvious" in their correctness, others less so. Many things are done in a particular manner simply because it was the first or simplest way that came to the mind of the designer of the system. Many other aspects of technology are designed to fit the general notions of how that kind of technology (a certain kind of software, say) is expected to behave. These traditions are mostly nascent and often the learned lore of the industry; few formal instructional programs exist with the intent to teach them.

As technologists, we must be careful to evaluate those traditions in light of the current context; misapplying them can be more damaging than simply ignoring them, as expectations will be set when a particular tradition is invoked, whether by behavior or interface.

At the heart of these traditions are technologists, geeks, hackers, and other highly technical professionals; these people have created, embraced, rejected, revised, and updated the various traditions we see today.

Programmer Superstitions

Programmer superstitions are among the most entrenched of these traditions, whether simply due to a sort of intellectual laziness or something more sinister. Programmer Superstitions raises a few points. Most of the points are considered in a vacuum, and in a single-platform, nearly single-language manner. Some examples seem to ignore any technological ecosystem other than the author's native one; yet this is one of the purposes of calling out superstitions: to break out of the universe where we are so entrenched.

Case Sensitivity

I have a gripe with the author's arguments advocating the case-insensitivity of programming languages. He states that the sensitivity of programming languages to the case of symbols is historical baggage from C that should not be accepted blindly. Not blindly, sure, but case-sensitivity was chosen for a reason. At the very least, programmers expect most languages to be case-sensitive. That is a historical argument, to be sure, but the expectation is case-sensitivity, and being an exception is more likely to cause problems.

More important, in my mind, is that filesystems, command shells, and nearly every other system in a computer are case-sensitive. At least on my computer. Windows filesystems are case-insensitive but case-preserving, as is HFS on a Mac. This is ostensibly fine, but is tricky to deal with programmatically, because program code cannot ignore case in data. Exceptions to default case-sensitivity are, however, very common (e.g. search).
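As a rough illustration of why case-insensitive but case-preserving storage is awkward to handle in code, consider the following Python sketch. The helper find_existing and the sample file names are hypothetical, invented only for this example; str.casefold is the standard-library method for caseless comparison.

```python
def find_existing(name, existing_names):
    """Return the stored spelling that matches name when case is ignored, or None."""
    wanted = name.casefold()
    for stored in existing_names:
        if stored.casefold() == wanted:
            return stored
    return None


# Hypothetical directory listing, as a case-preserving filesystem might report it.
listing = ["README.txt", "Makefile", "notes.TXT"]

# Ordinary string comparison is case-sensitive, so this lookup fails...
exact_match = "readme.txt" in listing

# ...and the program has to normalize explicitly to mimic the filesystem.
folded_match = find_existing("readme.txt", listing)
```

The program must remember to fold case at every comparison while still keeping the original spelling for display, which is exactly the kind of bookkeeping a uniformly case-sensitive system avoids.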

Orthogonality is an important benefit, but could be sacrificed were case-insensitivity a demonstrable win. I have never had a case-sensitive programming language cause undue headache that would have been avoided by ignoring case. Sure, typos happen, but I am a competent typist and such problems are not so difficult to solve as some may argue. The disadvantage is large: having distinct appearances with the same meaning. Programming languages are one of the most precise ways to express ideas, and the imprecision allowed by case-insensitivity seems an additional burden on an already complex problem.

One other point to make, a social one, is that programmers are heavily indoctrinated to preserve data. Throwing away information about case is like throwing away the most significant bit of a number. Allowing a variable name or language keyword to be spelled in a variety of ways allows variations in programming style that can interfere with version control and refactoring tools. In practice, some sort of normalized case is probably preferred over the other combinations of case the language allows. Being able to change nearly every character of a program and have it be functionally identical does not seem a desirable attribute in a programming language.

For more arguments (both ways) on the issue, see this Lambda the Ultimate thread.

Iteration Variables

Another point the author raises is that iteration variables in counted loops (like for loops) are often named according to the mathematical convention of using i, j, k, ..., yet he does not make a case for any other convention. He suggests that a and b would "make much more sense" (emphasis added), which is neither obvious nor defended. Surely having an established convention in a related discipline is a sufficient reason? There is no technical reason to give iteration variables any particular name, but many programmers will be familiar with the mathematical custom, especially those with a formal education, so the idiom will be natural and provide meaning where otherwise there would be arbitrariness and individual variation. I don't mean to imply that iteration variables should always be named in such a mathematical fashion, only when the mathematical analogy is useful. Naming a variable index, count, or any other descriptive name is, of course, ideal in a number of circumstances, especially in deeply nested and complex structures. The simple iteration names are useful for inner loops that have generally uncomplicated structure.
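A small Python sketch may make the distinction concrete. Both functions and their data layout are hypothetical, chosen only to contrast the two naming styles: the first is a simple index loop where i and j mirror the usual matrix subscripts, the second is a nested loop over domain objects where descriptive names carry the meaning.

```python
def transpose(matrix):
    """Return the transpose of a rectangular list-of-lists matrix."""
    row_count = len(matrix)
    column_count = len(matrix[0]) if matrix else 0
    result = [[None] * row_count for _ in range(column_count)]
    # i and j match the subscripts of the mathematical definition.
    for i in range(row_count):
        for j in range(column_count):
            result[j][i] = matrix[i][j]
    return result


def total_overdue(customers):
    """Sum the amounts of overdue invoices across all customers."""
    total = 0
    # Descriptive names say what each loop walks over.
    for customer in customers:
        for invoice in customer["invoices"]:
            if invoice["overdue"]:
                total += invoice["amount"]
    return total
```

Renaming i and j to a and b in the first function would lose the link to the familiar notation without gaining any descriptive value.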

Data Hiding

I do agree with the author that the "object-oriented" principle of data hiding is usually useless and often a noisy and inflexible idiom; though I suspect the author is less concerned with the utility of data hiding than with the discomfort he feels at the lack of it when programming.

The Python language does not have a mechanism to enforce data hiding, but is a fully capable object-oriented language. (The C# language lacks binary compatibility between properties and object fields, whereas Python has no such limitation, so there are technical reasons in some languages for the use of accessors.)
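The Python side of that comparison can be sketched briefly. The classes PointV1 and PointV2 and the function shift_right are hypothetical names for this example; the point is that a plain attribute can later become a property (using the built-in property decorator) while the calling code stays the same.

```python
class PointV1:
    """First version: x is a plain, public attribute."""

    def __init__(self, x):
        self.x = x


class PointV2:
    """Later version: x is a property that validates on assignment."""

    def __init__(self, x):
        self.x = x  # assignment goes through the setter below

    @property
    def x(self):
        return self._x

    @x.setter
    def x(self, value):
        if value < 0:
            raise ValueError("x must be non-negative")
        self._x = value


def shift_right(point, amount):
    """Caller code: written once, works with either version."""
    point.x = point.x + amount
    return point.x
```

Because callers never need to change, there is little reason in Python to write getter and setter methods up front just in case.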

I find I never do things in code that would be useful for the compiler or runtime to have prevented by enforcing data-hiding. Any time I break encapsulation, I leave a big #XXX: hack or similar comment to warn that the design needs to change, or that the code needs refactoring.

Being able to do that (rather than having to do the refactoring at that moment) allows for increased productivity, at the expense of some overhead when the refactoring needs to be done en masse.
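For illustration only, here is what such a marked shortcut might look like in Python. The Cache class and purge_all function are hypothetical; the leading underscore is Python's convention for "private", which the language does not enforce.

```python
class Cache:
    """A tiny key-value cache with a conventionally private store."""

    def __init__(self):
        self._entries = {}  # leading underscore: private by convention only

    def put(self, key, value):
        self._entries[key] = value

    def get(self, key):
        return self._entries.get(key)


def purge_all(cache):
    # XXX: hack, reaches into Cache._entries directly. Cache should grow a
    # public clear() method, and this function should call that instead.
    cache._entries.clear()
```

The comment is easy to search for later, so the shortcut stays visible until the refactoring is actually done.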