Tuesday, 25 August 2009

I laugh at your puny human 'objects'

Let's say that Earth is being invaded by aliens. The last tatters of the US government (and a motley crew of scientists and caravan enthusiasts) are holed up in a secret government base in the desert, where a crashed alien ship is stored. One of the dashing action heroes realises that he can engineer a computer virus to take out the alien shields that are preventing any meaningful counter-attack. Our brave hacker whips up some code, which must be delivered to the alien mothership using the (now-repaired) alien fighter. They sneak into the mothership undetected and upload the virus.

At the moment of truth, as they are about to detonate a nuclear device and make their escape, the virus program throws an exception because the behaviour of the 'AlienMothership' class differed slightly from our hero's interpretation, and the aliens succeed in taking over the world.

OK, so I ripped most of that off from Independence Day, but I think it goes to the heart of my least favourite technology: objects.

I'll freely admit that this is a bit of a rant. You're entitled to disagree.

Objects are a bad model for the universe. There, I said it. Object Oriented Programming (OOP) is oversold as a "natural" way of modelling things. It really isn't. Let's examine why:

Let's start with encapsulation. I have no problem with encapsulation, as such. It's a great idea. Except when it isn't. When you have hidden (mutable) state, I think it's pretty much a terrible idea. You define a nice tidy interface (or class definition, or whatever terminology you prefer), and then call some methods from another location. At any given moment, that object is in some state, and the meaning of calling another method on that object could be completely different. X.a() then X.b() may be completely different to X.b() then X.a(). This implied ordering is not really expressed anywhere. There might be an in-source comment or a note in the documentation that mentions it, but really, you're on your own here.

I'm not talking about incorrect behaviour here either, where it would be possible to check (at runtime) that a call to X.a() never occurs after X.b(). I'm referring to a situation where both execution orders are perfectly valid, but have totally different meanings. How can I resolve this? Well, naturally, I just look inside the class and figure out how internal state is modified and what the behaviour will be depending on the internal state. Whoops! I'm basically back in imperative-land.
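A minimal Python sketch of this kind of order dependence. The `Account` class and its `add_bonus` and `apply_interest` methods are hypothetical, invented purely for illustration. Both call orders are perfectly valid, yet they mean different things, and nothing in the interface says so.

```python
class Account:
    """Hypothetical class with hidden mutable state (illustration only)."""

    def __init__(self, balance):
        self._balance = balance  # hidden state, changed by every call

    def add_bonus(self):
        # Adds a flat bonus of 10 to the hidden balance.
        self._balance += 10

    def apply_interest(self):
        # Doubles the hidden balance (a deliberately exaggerated rate).
        self._balance *= 2

    def balance(self):
        return self._balance


def bonus_then_interest(start):
    # X.a() then X.b()
    acct = Account(start)
    acct.add_bonus()
    acct.apply_interest()
    return acct.balance()


def interest_then_bonus(start):
    # X.b() then X.a()
    acct = Account(start)
    acct.apply_interest()
    acct.add_bonus()
    return acct.balance()
```

Starting from 100, `bonus_then_interest` returns 220 and `interest_then_bonus` returns 210. Neither result is an error, so no runtime check could flag the difference.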

To summarise, I think hiding behaviour inside objects makes no sense. "Hiding" computation away is basically ignoring the point of the program. Abstracting away irrelevant computation makes sense, but OOP takes this paradigm way too far, hiding away both irrelevant computation and the computation we are interested in.

I asserted earlier that I thought objects were a poor model for the universe. I'll try to elaborate on this claim.

The classic OO examples always feature things that are somewhat modular in their design, interacting in terms of clearly defined events, with some internal state. For example, one might model a bicycle with wheels and a 'pedal()' method. Or a 'Point' class that perfectly captures the fact that a point has x and y coordinates, with some methods to handle translation or rotation or any other interesting operation.

But a 'point' is just data. Why does it have behaviour? In this case we're basically just using objects to control name spaces. It's nice to be able to have the word 'point' appear in operations on points, and have all of the functionality associated with a point in one place. Great, so let's do that, but instead of hiding the state, let's have a 'Point' module with no internal state, which gathers up all of the point-related functionality that operates on values of type 'point'.
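A rough sketch of such a stateless 'Point' module, written in Python rather than SML and assuming the code lives in a hypothetical file such as `point.py`. `NamedTuple` is used here only as an immutable record; the names `translate` and `rotate` are illustrative choices, not taken from any particular library.

```python
import math
from typing import NamedTuple


class Point(NamedTuple):
    """Plain immutable data: just an x and a y coordinate."""
    x: float
    y: float


def translate(p, dx, dy):
    # Returns a new Point; the argument is never modified.
    return Point(p.x + dx, p.y + dy)


def rotate(p, angle):
    # Rotates p about the origin by `angle` radians, returning a new Point.
    c, s = math.cos(angle), math.sin(angle)
    return Point(p.x * c - p.y * s, p.x * s + p.y * c)
```

All of the point-related functionality still sits in one place under one name, but every function takes a value in and hands a value back, so there is no hidden state for a caller to worry about.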

A bike won't do anything without a person to pedal it. A bike makes good sense from an OOP point of view. We can have some internal state, recording things about the bike: its position and maybe its speed and direction of travel. The "pedal()" and "turnHandleBars()" methods alter this internal state. But a bike exists in the real world. The "world" records the position of the bike, not the bike itself. You can't pick up a bike in the real world and ask it where it is, or where it is going. A real bike stores none of this information; rather, the context in which the bike exists tells us all of that information.

So when we model a 'Bike' in an OO context, we're actually talking about a bike in a particular context. The claimed decoupling just hasn't happened. It's unlikely that any natural implementation of a bike could be picked up and re-used in a context where different world-assumptions exist. Perhaps we want to ride a bike on the moon! It seems unlikely we'd be able to do that without going back and changing some internal 'gravity' parameter inside the object. When we model the world this way, we are encoding all sorts of assumptions about the world into the object itself.

If we do the 'right thing' here and remove any world-specific state from the Bike class, what have we got left? In this case, essentially an empty object that simply records the fact that a bike exists. Why do we need an object to do that?
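One possible way to sketch the alternative in Python, under heavy simplifying assumptions: the world is a plain record, the bike is nothing more than an identifier, and pedalling is a function from one world to the next. The `World` record, its `gravity` and `positions` fields, and the `pedal` function are all hypothetical names chosen for this illustration.

```python
from typing import NamedTuple


class World(NamedTuple):
    """The context: it, not the bike, records where each bike is."""
    gravity: float    # m/s^2; a property of the world, unused in this sketch
    positions: dict   # bike identifier -> metres along a straight road


def pedal(world, bike_id, distance):
    # Returns a new World in which the named bike has moved forward.
    # The original world and its positions dict are left untouched.
    new_positions = dict(world.positions)
    new_positions[bike_id] = new_positions[bike_id] + distance
    return world._replace(positions=new_positions)
```

Riding the same bike on the moon then means passing a different `World` value, for example one with `gravity=1.62`, rather than reaching inside a bike object to change a parameter.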

Then there's the question of active and passive behaviour. A bike does not move itself. A person does. A method in the 'Person' class might call the 'pedal' method on the Bike class. This is a relatively sane operation. Calling methods on a Person becomes a much stranger kind of game. Obviously this is a modelling task, so we're abstracting away from the real world, but there is only one kind of method-calling, which models the interaction between a passive system and an active one. Where is the thread of program control? What about active objects calling active objects? Where do 'utility' functions fit in? They have no real-world equivalent, but they still get shoe-horned into the OO paradigm. Don't even mention concurrency in this situation - that's just a horror show waiting to happen.

This isn't just irrational OOP-hatred, by the way. I used to do a lot of programming in Java and C++. These days I use SML. I've found it to be a vast improvement on the situation, and here is why:
  • Records, Tuples and recursive data types fulfil most of the "structured data storage" uses of objects.
  • The module system provides the kind of name-space control and "bundling" for which people use classes.
  • Referential transparency means you can easily understand the meaning and purpose of a given function that operates on data. There is no "hidden state" which modifies this behaviour in unpredictable ways.
  • The "control" of the program follows one predictable path of execution, making mental verification considerably easier.
  • Important computation isn't inadvertently "hidden away". If computation happens, it is because there is a function that does it explicitly.
  • Clear distinction between "passive" data and "active" computation.
  • No hidden state means it is much harder to unintentionally model the object of interest and its context together.
The only thing that object orientation does for most programs is to provide basic name-space control. I don't think it significantly helps maintainability or readability. There are much better ways to do name-space control than wheeling out a giant, unwieldy object system that everything has to be crammed into.
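As a rough illustration of the first and sixth points in the list above, here is a recursive data type with the computation kept separate from the data. It is written in Python as a stand-in for an SML datatype; the `Num`, `Add` and `Mul` records and the `evaluate` function are hypothetical names used only for this sketch.

```python
from typing import NamedTuple, Union


class Num(NamedTuple):
    value: int


class Add(NamedTuple):
    left: "Expr"
    right: "Expr"


class Mul(NamedTuple):
    left: "Expr"
    right: "Expr"


# A recursive data type: an expression is a number, a sum, or a product.
Expr = Union[Num, Add, Mul]


def evaluate(e):
    # "Active" computation lives here, outside the "passive" data.
    if isinstance(e, Num):
        return e.value
    if isinstance(e, Add):
        return evaluate(e.left) + evaluate(e.right)
    if isinstance(e, Mul):
        return evaluate(e.left) * evaluate(e.right)
    raise TypeError("not an expression: %r" % (e,))
```

For example, `evaluate(Add(Num(2), Mul(Num(3), Num(4))))` returns 14. The records carry no behaviour of their own, and the one place where computation happens is an explicit function.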

I think the problem essentially exists as a result of static modelling of the universe. If you freeze a moment in time, then sure, everything does look like objects. But we're interested in behaviour and time and computation, none of which are captured very well by bolting on some "behaviour" to an essentially static model of a system. And sadder still, instead of putting the behaviour somewhere sensible, it gets wedged into the static models themselves!

There are a few other things I really dislike. Inheritance, for example. This post is getting really long though, so I'll stop here with the comment: if I were fighting the alien hordes, it wouldn't be with objects.


  1. There are way too many strawman arguments and false dichotomies in this. Particularly your example of the bike, which seems to be an exercise in picking a flawed OO representation and shooting that down in an attempt to shoot down OO concepts. In what context would you possibly define gravitational parameters in the bike object?

    If in removing world specific state from the bike object it becomes "empty" then it might as well not exist in its own terms: it would be a mistake made in the modelling stage to have modelled it explicitly. However, removing world state from the bike object would not necessarily render it empty either, it depends on the scope of the problem you're modelling.

    If you have a point, I'm afraid it's lost beneath some really sketchy arguments.

  2. How would you model the relationship between calling the "pedal" function and some update of the internal state of the bike? Pedalling the bike and the corresponding change in its state with respect to its environment must either exist inside the Bike, or inside the environment. Which is it?

    The use of a 'Bike' is a textbook OOP example, hence why I used it.

  3. Actually I thought the post was itself a real-world example of abstraction - but because of the context I was able to follow Gian's argument easily. We all operate in a real world and this really needed saying. And I say this with the benefit of over 10 years of successful OOP experience. Let's stop drinking the Koolaid and consider appropriate technology.

  4. I think you make a valid point. The OOP paradigm is not always perfect and does not fit every problem or situation.

    As always it's about using the best tool for the job (instead of transforming everything into a nail to fit that hammer of yours).

    For the moment I find Python to be a very versatile tool that actually lets me model my problem in ways that suit the problem (instead of the language).

  5. The point is: abstraction and encapsulation are better made with a decent module system. Inheritance is useless when functions are as flexible as integers. Class polymorphism loses much of its interest when you have algebraic datatypes.

    OO doesn't suck that bad. We just know better.

  6. »At any given moment, that object is in some state, and the meaning of calling another method on that object could be completely different. X.a() then X.b() may be completely different to X.b() then X.a(). This implied ordering is not really expressed anywhere.«

    Regarding, let's say, some C++ OOP implementation, I disagree. It is expressed in the very nature that an action on an object MIGHT change it. When I say number.add(1) and number.pow(3), it's clear the order matters, and that's what has to be explicitly assumed for all member functions.

    Also, I recall the const member function qualifier, which can solve exactly your problem: const => non-mutating; non-const => mutating. Non-mutating actions could/should commute. Also, of course, static functions.

  7. Sounds like you want "Entity Systems" (as defined by how game developers use the term).

  8. Arr, sorry, damn links didn't paste correctly... :-(

  9. "You can't pick up a bike in the real world and ask it where it is, or where it is going. A real bike stores none of this information, rather the context in which the bike exists tells us all of that information."

    I get your point that bikes aren't very talkative. But neither is the world through which the bike moves.

    The bike still "knows" its location. Ask it. "Where are you?" The blank stare you get from the handlebars means "Right here." That doesn't seem all that useful, but call increaseVelocity() on that bike and then ask it a more complicated question, "Where will you be in the future?" and it will answer "over there."

    Anthropomorphism aside, the bike has an internal state, which is why it doesn't have to guess and say "the moon" or "the bottom of the ocean." Sure, you can ask similar questions of the context, "Street, is there a bike on you?" but that's just an issue of relativity, not of where variables are stored.

  10. tl; dr. On the topic of modeling the real world, however, you might find this interesting: http://en.wikipedia.org/wiki/Antiobjects

  11. OO is not the perfect paradigm, but you need to improve your critique or it will be just a misguided rant.
    For example, regarding ordering: it is required for things like IO. Why do you think monads are so important in Haskell?