Sunday 9 August 2009

When are we going to stop writing code?

Programmers, generally speaking, like writing code. It seems obvious, but it's important to the point I would like to make.

Software defects arise from writing code. Sure, some classes of errors arise because programmers or stakeholders simply get the requirements or specifications wrong, but even those mistakes only manifest once they are translated (by flawed, fallible humans) into something executable.

So a very simple way to greatly reduce the number of defects that exist in our software seems to be to stop humans writing code. By pushing more of the work into compilers and tools (which we can verify with a high degree of confidence), we reduce the areas where human error can lead to software defects.

We're already on this path, essentially. Very few people write very low-level code by hand these days. We rely on compilers to generate executable code for us, which allows us to work at a higher level of abstraction where we are more likely to be able to analyse and discover mistakes without needing to run the program.

Similarly, type systems integrated with compilers and static analysis tools remove the burden on us as programmers to manually verify certain runtime properties of our systems. Garbage collectors remove humans from the memory-allocation game altogether.

See what I'm getting at? We have progressively removed bits of software development from the reach of application developers. Similarly, the extensive standard libraries packaged with mainstream programming languages (hopefully!) mean programmers no longer need to create bespoke implementations of often-used features. The less code a programmer writes, the fewer chances he or she has to introduce errors (errors in library implementations are a separate issue - however, a finite body of code re-used by many people is likely to become much better over time than code written for, and used in, a single application).

The rise of various MVC-style frameworks that generate a lot of boilerplate code (e.g. Ruby on Rails, CakePHP, etc.) further shrinks the sphere of influence of the application developer. In an ideal world, we would use all of these sorts of features so that we essentially just write down the interesting bits of our application's functionality, while the surrounding tools ensure that global consistency is maintained. As long as we can have a high degree of confidence in our tools, we should produce very few errors.

There is one basic problem: it doesn't go far enough.

Despite the best intentions of their authors, Ruby on Rails and CakePHP are basically abominations. I speak only of these two in particular because I've had direct experience with them; perhaps other such frameworks are not awful. The flaws in both frameworks can essentially be blamed on their implementation languages, and on the paradigm that governs their implementations. Without any kind of type safety, and with very little to help the programmer avoid making silly mistakes (e.g. mis-spelling a variable name), we can't really have a high degree of confidence in these tools.
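
To make the "mis-spelled variable" point concrete, here is a tiny, contrived sketch in Haskell (a language I'll come back to below; the module and names are invented purely for illustration and have nothing to do with either framework). Because every name has to resolve at compile time, this whole class of typo cannot survive to runtime:

    -- Names are resolved entirely at compile time.
    module Greeting where

    greet :: String -> String
    greet userName = "Hello, " ++ userName ++ "!"

    -- If the body referred to "usrName" by mistake, GHC would refuse to compile
    -- the module ("Variable not in scope: usrName") instead of letting the typo
    -- surface as a runtime failure on some rarely-exercised code path.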

Compilers, on the other hand, are generally very good. I have a high degree of confidence in most of the compilers I use. Sure, there are occasional bugs, but as long as you're not doing safety-critical development, most compilers are perfectly acceptable.

So why are there still defects in software? First, most new developments still use old tools and technologies. If any kind of meritocracy were in operation, I would guess that very little new software other than OS kernels and time-critical embedded systems would be written in C, but that's simply not the case. Many things that make us much better programmers (by preventing us from meddling in parts of the development!) are regarded as "too hard" for the average programmer. Why learn how to use the pre-existing implementation that has been tested and refined over many years when you can just roll your own, or keep doing what you've always done? Nobody likes to feel out of their depth, and clinging tightly to old ideas is one way to prevent this.

Having done quite a bit of programming using technologies that are "too hard" (e.g. I'm a big fan of functional programming languages such as ML and Haskell), I think that if you use these technologies as they are designed to be used, you can dramatically reduce the number of defects in your software. I know I criticised methodology "experts" in my previous post for using anecdotal evidence to support claims, but this isn't entirely anecdotal. A language with a mathematical guarantee of type safety removes even the possibility of deliberately constructing programs that exhibit certain classes of errors. They simply cannot happen, and we can have a high degree of confidence in their impossibility. As programmers, we do not even need to consider contingencies or error handling for these cases, because the compiler will simply not allow them to occur. This is a huge step in the right direction. We just need more people to start using these sorts of approaches.
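
To make that concrete, here is a small, contrived sketch (the names phoneBook and describe are invented for illustration): a lookup that might fail returns a Maybe value, so the possibility of failure is part of the type, and the compiler rejects any program that forgets to handle it.

    import qualified Data.Map as Map

    phoneBook :: Map.Map String String
    phoneBook = Map.fromList [("alice", "555-0100"), ("bob", "555-0199")]

    -- Map.lookup returns Maybe String, so a caller must say what happens when
    -- the name is absent before the number can be used at all.
    describe :: String -> String
    describe name =
      case Map.lookup name phoneBook of
        Nothing     -> name ++ " has no number on record"
        Just number -> name ++ " is on " ++ number

There is no way to "forget" the Nothing case and dereference a null; a program that tried to use the Maybe value directly as a String would simply fail to type-check.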

So, the title of this post was "When are we going to stop writing code?", and I ask this with some seriousness. As we shrink the range of things that programmers are responsible for in software development, we shrink the set of software defects that they can cause. Let's keep going! I believe it is very nearly within our reach to stop writing software and start specifying software instead. Write the specification, press a button and have a full implementation that is mathematically guaranteed to implement your specification.

Sure, there may be bugs in the specification, but we already have some good strategies for finding bugs in code. With a quick development cycle, we could refine a specification through testing and through static analysis. We can build tools for specifications that ensure internal consistency. And as in the other situations where we have been able to provide humans with more abstract ways to represent their intentions, it becomes much easier for a human to verify the correctness of the representation with respect to their original intentions, without the need to run a "mental compiler" between code and the expected behaviour. This means we can leave people to solve problems and let machines write code.
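
We aren't there yet, but you can already get a small taste of this style today. As a deliberately tiny sketch (using QuickCheck for Haskell; the properties are invented for illustration, and this is checking rather than full code generation), you write down properties the implementation must satisfy and let a tool hunt for counterexamples instead of enumerating test cases by hand:

    import Test.QuickCheck

    -- Two properties that any correct list reversal must satisfy.
    prop_reverseTwice :: [Int] -> Bool
    prop_reverseTwice xs = reverse (reverse xs) == xs

    prop_reverseAppend :: [Int] -> [Int] -> Bool
    prop_reverseAppend xs ys = reverse (xs ++ ys) == reverse ys ++ reverse xs

    main :: IO ()
    main = do
      quickCheck prop_reverseTwice    -- each property is exercised against
      quickCheck prop_reverseAppend   -- randomly generated inputs

It is a long way from "press a button and get an implementation", but it is already a case of stating what the software must do and letting a machine do the tedious part.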

That said, it's probably still not realistic for people to stop writing code tomorrow. The tools that exist today are far from perfect. We're still going to be forced to write code for the foreseeable future. We can get pretty close to the Utopian ideal simply by using the best tools available to us here and now, and in the meantime, I'm going to keep working on writing less code.

9 comments:

  1. Are you getting at Formal Methods and associated languages (such as Z)? The tools do already exist, but they are specialist. The skillsets are also not widespread since they are not widely taught (or used after they are taught in advanced Computer Science courses). The application domains, to date, are limited to military, aerospace and medical, for the obvious reasons.

  2. Nice post, and I agree completely. There are efforts underway to do what you're suggesting -- essentially making formal methods compile. Could you introduce yourself a bit more?

  3. Paul, yes, I'm kinda getting at "formal methods" being the way forward (FM is how I pay my bills). However, I was trying to avoid using the "f word", because what I'm advocating is essentially moving away from all non-formal methods! I see the big challenges for the future of programming languages as relating to the ability to bring niche formal methods into the mainstream. Projects such as B are a good start - they can generate Java code from a specification through stepwise refinement - however, the overhead is still too big for most application development (outside of the domains you mentioned). If we can bring these sorts of tools and methods within the reach of everyday applications, I think the world could be a considerably happier place.

    David, thanks. I'm aware of a few of the efforts, and I've been involved in a few of them. As for me, I'm a computer science graduate. I went to The University of Waikato in New Zealand. I now live in the UK, where I work for a company in Central London. I'm being deliberately vague here, as I wouldn't want anyone to get the idea that my not-necessarily-well-thought-out opinions constitute the opinions of my employers :)

  4. There's also the perverse motivation to do things "the hard way". I know a few programmers who will still defend using C for everything because it's hard, and it makes them feel manly...

  5. Precisely, blackdog! And that's absolutely fine for hobbyist projects and for educational purposes. I have nothing but respect for the people who build computers out of mechanical relays and who write things in M68K machine code "just for fun". I've certainly done some bizarre things "the hard way" for my own satisfaction.

    The problem with that entire approach, as I'm sure you know, is that people often feel the need to get their kicks when writing software that they are going to inflict upon other people.

  6. Sorry, I can't agree. I want the sharpest tool that I can get. What you're talking about, IMHO, is restricting the scope of programmers.

    I ask: what are you going to do when the limited scope that you have set for programmers meets a context that the scope doesn't cover?

    I totally believe in using tools that remove accidental complexity from programming. Using static versus dynamic typing to program does not resolve the accidental complexity.

    A programmer should be programming at the correct abstraction level.

  7. Perhaps that's an ideological point. I would rather go to a little extra effort to formulate my "programs" in such a way that I get the assurance that they will work. I'm not sure how that's any more restrictive than the requirement that you use a fixed set of constructs in a programming language to construct a program. Everyone happily shoe-horns every problem and solution into objects, and they get along fine!

    Also, I really think that strong, static typing _does_ reduce complexity. There can be no unchecked runtime type errors, and there is no possibility for me to make mistakes while writing type-checking code, or doing unchecked conversions. This is a _huge_ class of errors, and in conjunction with sensible use of garbage collection and conservative use of mutable storage, you get a guarantee that your program will not crash.
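
    A contrived little sketch of what I mean (Haskell again; the names are invented purely for illustration): wrap quantities in distinct newtypes, and the conversion mistakes you would otherwise have to guard against by hand simply never compile.

    newtype Metres  = Metres  Double
    newtype Seconds = Seconds Double

    speed :: Metres -> Seconds -> Double
    speed (Metres d) (Seconds t) = d / t

    -- speed (Metres 100) (Seconds 9.58) is fine, whereas
    -- speed (Seconds 9.58) (Metres 100) is rejected at compile time,
    -- so there is no checking code for me to get wrong.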

    So if we are comparing complexity, it seems like a program that works all the time is actually considerably less "accidentally complex" than one that exhibits unintended behaviour because of bugs introduced in parts of the program that are tangential to the actual problem being solved (e.g. memory allocation, casts, type coercion etc).

  8. "Programs must be written for people to read, and only incidentally for machines to execute."

    - Abelson & Sussman, SICP, preface to the first edition

    Most changes to software occur in the maintenance phase (brown-field), not in the initial write (green-field). I would also go so far as to say that most programmer man-hours are spent in brown mode.

    To me, static typing gets in the way of the reading. Look at this code (C#) on my blog:

    http://uglycode.wordpress.com/2009/08/11/extension-methods-or-a-closure-why-not-both/

    And here it is in Common Lisp:

    (defun lisp-print (prefix)
      (format nil "~A Executing" prefix))

    (defmacro looper (count fn)
      `(loop for i from 1 to ,count
             collect ,fn))

    (list (looper 1 (lisp-print "Start"))
          (looper 5 (lisp-print "Middle"))
          (looper 1 (lisp-print "End")))

    Which is easier to read?

    If you start SPECIFYING your software, static typing will become noise in the way.

    You might want to check this blog post on abstractions:
    http://apocalisp.wordpress.com/2009/04/27/a-critique-of-impure-reason/

    I have one final question, what is your stance concerning metaprogramming? Is it a necessary programming technique or is it magic?

  9. I'm really not a fan of "readability" arguments except as a personal value-judgement. With good type-inference (such as HM-style inference), there are almost no type annotations present in the code. I think that is quite readable :)
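
    As a throwaway sketch (the definitions are invented, just to show the flavour): none of these carries a type annotation, yet each gets a fully general static type inferred and checked.

    swap (x, y) = (y, x)                 -- inferred: (a, b) -> (b, a)

    compose f g x = f (g x)              -- inferred: (b -> c) -> (a -> b) -> a -> c

    countMatches p = length . filter p   -- inferred: (a -> Bool) -> [a] -> Int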

    Also, I should really point out that people are already doing most of what I was advocating. People already do specify software in formal and semi-formal ways. The system works! It just needs to be brought within the reach of ordinary programmers.

    Metaprogramming seems like a good idea, although I can't say I've had a lot of cause to use it.

    The Ur/Web system is one case where I think metaprogramming is being applied to very good effect.

    http://www.impredicative.com/ur/
