Sunday, 1 August 2010

"It's a hardware problem" is a programmer problem

I've heard programmers say "it's a hardware problem" too often, usually with a bit of back-slapping and joviality, as though the catch-all "hardware problem" descriptor frees the programmer from any responsibility to investigate further. While the spirit of such comments is usually tongue-in-cheek, I can't help but feel it's symptomatic of the view too many programmers take of the machinery of the computer: that it is somehow magical.

Programmers make a lot of assumptions about hardware, and modern operating systems allow (and encourage!) this. I'm definitely not suggesting that programmers should constantly be shifting to a lower level of abstraction (quite the opposite, I'm all for more abstractions!), but it has to be done with an eye to the fact that abstractions are exactly that: abstractions of concrete systems.

Areas where I see the most mistakes, in no particular order:

  • Time
  • Performance
  • Concurrency

Let's examine them briefly.

Time

Oh god. Why does time seem to turn programmers silly? There is a well-known phenomenon in physics: no two clocks ever agree. Computers are no exception. Even at relatively large time scales, the clocks on two different machines will drift apart, so trying to line up time stamps across them exactly is never, ever going to work. And that's assuming you've already accounted for the jitter introduced by software overheads and scheduling delays. Even the hardware clocks themselves differ in frequency and resolution. So please, don't assume that the hardware behind your calls to the system clock is somehow magical. It's really not.
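To make this concrete, here's a minimal sketch (assuming Linux/POSIX and clock_gettime) of the two clocks your OS typically hands you: the wall clock, which NTP is free to step forwards or backwards under your feet, and the monotonic clock, which only ever moves forward. If you're measuring intervals or ordering events on one machine, use the latter, and don't pretend either one lines up neatly with a clock on another box.

```c
/* A minimal sketch (Linux/POSIX assumed): the wall clock can jump,
 * the monotonic clock only advances. Neither is magical. */
#include <stdio.h>
#include <time.h>

int main(void)
{
    struct timespec wall, mono;

    clock_gettime(CLOCK_REALTIME,  &wall);  /* "what time is it" - NTP may step this */
    clock_gettime(CLOCK_MONOTONIC, &mono);  /* time since some fixed point - only advances */

    printf("realtime:  %ld.%09ld\n", (long)wall.tv_sec, wall.tv_nsec);
    printf("monotonic: %ld.%09ld\n", (long)mono.tv_sec, mono.tv_nsec);
    return 0;
}
```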

Performance

Assuming you can predict anything about the real-time characteristics of a piece of software without extensive analysis is the ultimate form of self-delusion, and yet people regularly seem to engage in it. Why, oh why!? Given the amount of instruction re-ordering, instruction-level parallelism and caching that goes on inside a modern CPU, it's worth either understanding exactly what is going on, or treating it as completely non-deterministic. The situation only gets worse when you start talking about the cost of accessing main memory or (god forbid!) hard drives.
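If you want to convince yourself, here's a rough sketch (POSIX assumed, all numbers entirely machine-dependent) that does the same number of additions over the same array twice: once sequentially, once with a large stride. The strided walk defeats the cache and the prefetcher and is usually several times slower. The point isn't the particular ratio; it's that you can't know it without measuring on your hardware.

```c
/* A rough sketch: same work, very different memory access patterns.
 * Compile with e.g. gcc -O2 stride.c and run it to see your own numbers. */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N      (1 << 24)   /* 16M ints, about 64 MB - bigger than any cache */
#define STRIDE 4096        /* jump far enough that nearly every access misses */

static double now(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec + ts.tv_nsec / 1e9;
}

int main(void)
{
    int *a = calloc(N, sizeof *a);
    long sum = 0;
    double t;

    if (!a)
        return 1;

    /* Sequential walk: friendly to cache lines and the hardware prefetcher. */
    t = now();
    for (long i = 0; i < N; i++)
        sum += a[i];
    printf("sequential: %.3f s\n", now() - t);

    /* Strided walk: exactly N accesses again, but mostly cache misses. */
    t = now();
    for (long s = 0; s < STRIDE; s++)
        for (long i = s; i < N; i += STRIDE)
            sum += a[i];
    printf("strided:    %.3f s\n", now() - t);

    printf("(checksum %ld)\n", sum);   /* use sum so the loops aren't optimised away */
    free(a);
    return 0;
}
```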

Concurrency

With the advent of modern commodity multicore hardware, there seems to be a tendency amongst some programmers to think about concurrency in terms of magical hardware, as though it obviated the need to think about all of the extra possibilities for race conditions that it introduces. The hardware is not magical and doesn't do anybody any favours. Only sensible up-front design, combined with an approach that deals with the unpredictability of hardware performance in concurrent situations, will yield the kinds of successes we hope for.
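For the non-believers, a minimal sketch with POSIX threads (compile without optimisation, e.g. gcc race.c -pthread, so the increments aren't hoisted out of the loop): two threads bumping a shared counter with no synchronisation. The lost updates are not theoretical; the final count usually falls short of two million, and the hardware feels no obligation to warn you.

```c
/* A minimal sketch of a data race (POSIX threads assumed). */
#include <pthread.h>
#include <stdio.h>

#define ITERATIONS 1000000

static long counter = 0;   /* shared and unprotected: this is the bug */

static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < ITERATIONS; i++)
        counter++;         /* read-modify-write: not atomic */
    return NULL;
}

int main(void)
{
    pthread_t a, b;

    pthread_create(&a, NULL, worker, NULL);
    pthread_create(&b, NULL, worker, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);

    /* A mutex around the increment (or an atomic counter) would make this 2000000. */
    printf("counter = %ld (expected %d)\n", counter, 2 * ITERATIONS);
    return 0;
}
```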

Let's banish the magical thinking about hardware from our discipline, huh?