Sunday 21 October 2012

The average value of an idea is (almost) zero


An idea is worth nothing.  Zip.  Nada.  If you already know this, do not read further.  There is no new wisdom contained in this post.  I frequent a popular startup-related IRC channel, and about once per day someone will join telling us they've got an "awesome idea" that is bound to change the world.  The only problem is that they are unwilling to tell anyone that idea, because they are worried someone will steal it.

Usually after about 20 minutes of someone patiently explaining why this is absurd, they see that sharing an idea with a group of people is actually a very low-risk activity, because ideas have very little intrinsic value.  This post is to avoid needing to explain this afresh every single time.

The first reason ideas are worth (almost) nothing is that everybody has them.  All the time.  The problem is that most of them just aren't good ideas, and the sheer abundance of ideas drags the average value down further.  If you talk to a lot of entrepreneurial types, they will tell you that they have far more good ideas than they could possibly execute.  This gives us point 1:

Point 1: Most people have lives and jobs and commitments.  They probably don't have the time or inclination to steal your idea and put in the work to execute it.

If you are convinced that fortune has smiled upon you in delivering one of the ever-so-rare *good* ideas to you, then you still don't have a good reason to value your idea.  The reason is simple: if someone else can out-execute you after hearing the idea for the first time, despite the fact that all of the work lies ahead of that point, then it probably wasn't a great idea for you to attempt to turn into a business, because it offered you no proprietary advantage whatsoever.  I've prepared this helpful chart:

[Chart: the (subjective) value of an idea grows only slowly during the thinking stage, and rises steeply only once real execution begins.]

The value of an idea increases very slowly, even as you invest time in it.  The gains in (relative, subjective) value early on are mostly tiny as you mull over your idea and test it against some mental model of the world.  As you move further towards actually executing your idea, you might start talking to people and receiving advice, or you might start to pay extra-close attention as you go about your day to find relevant tidbits of information that help you to validate your idea at this early stage.  This is the first segment of the chart.  However, all of this work can be replicated relatively easily, and therefore it has very little objective value.  Anyone who hears your idea is still at the very start of this process, in terms of the investment of time and effort that will be required to turn it into something worth anything at all.

Point 2: Thinking time is cheap and plentiful, and you're not giving much away by saving someone a few {minutes, hours, days} of thinking about something.

I'll hasten to clarify here that there's a difference between sharing an idea and sharing details of an idea.  If you've invented a wonderful new chemical that turns puppy poo into gold, then you can communicate this by saying only what it does.  You don't need to give away the chemical formula! 

Point 3: If you have a proprietary advantage, tell people what your idea does, not how it does it.

Then you move towards actual execution, in which you start building, talking, producing --- whatever it is that you do to create actual value in your business.  This is always going to be hard work.  Expertise and domain knowledge will change the slope of this curve, but the area under it is going to be the same.  Your "first mover" advantage is worth little if someone else can come along and out-execute you merely by having heard your idea: if they can out-execute you when you already have a head-start, they could certainly have out-executed you without one.  And if merely hearing the idea permits someone else to execute it better, then you were not the best-suited to execute it, and it probably wasn't a good idea for you in the first place.

Point 4: If you can be out-executed due to a lack of a proprietary advantage, it doesn't matter whether it happens now or later.

Now for a counter-example.  In contrast to the puppy-poo-into-gold invention earlier, where having invented something proprietary gives you a clear advantage over any potential competitors, consider this:

"Hey bro, I've got this great idea!  I'm going to make a site like Facebook, but it'll be green instead of blue!".

I wouldn't want to claim off-hand that this is a terrible idea (the fact that people insist on dyeing beer green on St Patrick's Day might suggest it has some merit).  It is, however, demonstrably not a good idea, because the "idea" part creates exactly no extra barrier to entry: everybody who can build a website is capable of making it green instead of blue.  Lastly, consider that if for whatever insane reason our hypothetical "green Facebook" turned out to be a success, Facebook itself could simply switch its colour scheme to green, and your commercial advantage would disappear with a stroke of CSS.

In summary:  if, by hearing your idea, I can immediately go out and make a viable business out of it before you can, and if my doing so would prevent you from making it a viable business, then it was not a good idea in the first place.  Tell people your ideas, because the feedback you get is worth a lot more than any minuscule first-mover advantage you might have on a genuinely good idea.

Wednesday 13 April 2011

NZ Copyright Amendment Letter

I don't usually post non-PL-related things here, but there is some legislation brewing in New Zealand that is deeply unsettling to me, and so I'm publishing here the letter I wrote to my local Member of Parliament for the area in which I cast my vote:


Dear Mr Bennett,

I am writing to you as a constituent to express my deep concern for both the substance of the Copyright (Infringing File Sharing) Amendment currently before the house, and for the procedure by which this legislation has reached its current status.

The use of Parliamentary urgency, under the auspices of assisting with the aftermath of the Canterbury earthquake, to advance this controversial and flawed legislation is deeply disturbing to me. It is a perversion of the democratic process, in which due time and consideration are given to the will of the people. This attempt to rush through the House legislation that has no particular time pressure attached to it reflects poorly upon the dignity and legitimacy of our democracy.

Furthermore, the tone and substance of the debate in the house suggested that the majority of the MPs are not well-informed about the complex technical, legal and social implications of this legislation, particularly the "disconnection clause" that gives an extra-judicial remedy to intellectual property owners who are not required to submit to the usual burden of proof as in any other civil action. Given the fundamental nature of the internet within New Zealand business, education and personal life, this remedy that may be applied in the presence of a mere accusation seems disproportionate and set to disadvantage both the businesses that provide internet services within New Zealand, and the consumers of internet services who rely on them for so many facets of daily life.

As a doctoral student in Computer Science, I am perhaps more troubled than those members of Parliament currently advancing this legislation by the many technical and social issues it leaves unaddressed. I (and many others within the internet community in New Zealand) feel that this legislation, while addressing a perceived problem, has not been adequately discussed and refined to the point where it could be effective in addressing the purported issues. This is another reason why attempting to enact this legislation so quickly is inappropriate.

I hope that you will take my genuine concerns in the spirit that they are given, and that you can be persuaded to oppose both the process and substance of this legislation until such a time as the full implications can be considered and discussed by the public at large, and effective, fair legislation can be drafted that will encourage the development of healthy creative industries and a healthy internet industry within New Zealand.

Yours sincerely,

Gian Perrone, B.Sc (Hons), M.Sc (Hons)

As an aside, be warned that if this legislation passes, anyone reading this letter on this blog is probably guilty of *gasp* file sharing!

Sunday 1 August 2010

"It's a hardware problem" is a programmer problem

I've heard programmers say "it's a hardware problem" too often, usually with a bit of back-slapping and joviality, as though the catch-all "hardware problem" descriptor frees the programmer from any responsibility to investigate further. While the spirit of such comments is usually tongue-in-cheek, I can't help but feel it's symptomatic of the view too many programmers take of the machinery of the computer: that it is somehow magical.

Programmers make a lot of assumptions about hardware, and modern operating systems allow (and encourage!) this. I'm definitely not suggesting that programmers should constantly be shifting to a lower level of abstraction (quite the opposite, I'm all for more abstractions!), but this has to be done with a mind to the fact that abstractions are exactly that - they are abstractions of concrete systems.

Areas where I see the most mistakes, in no particular order:

  • Time
  • Performance
  • Concurrency

Let's examine them briefly.

Time

Oh god. Why does time seem to turn programmers silly? There is a well-known observation in physics: no two clocks are ever the same. Computers are no exception. Even at relatively large time scales, clocks on two different machines will always disagree, so trying to line up timestamps exactly is never, ever going to work. And that is assuming you have already managed to account for all the timing error introduced by software overheads and scheduling delays. Even the hardware clocks themselves are not the same. So please, don't assume that the hardware behind your calls to the system clock is somehow magical. It's really not.
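
To make this concrete, here is a minimal Python sketch (the names and the tolerance value are mine, purely illustrative) of the difference between comparing timestamps from two machines for equality and comparing them within an explicit tolerance:

TOLERANCE_SECONDS = 0.5  # application-specific guess, not a magic constant

def same_event_naive(ts_machine_a, ts_machine_b):
    # Almost certainly wrong across machines: the clocks are never identical.
    return ts_machine_a == ts_machine_b

def same_event_tolerant(ts_machine_a, ts_machine_b, tolerance=TOLERANCE_SECONDS):
    # Better: treat timestamps as approximate and compare within a window
    # sized to the clock skew and software delays you expect.
    return abs(ts_machine_a - ts_machine_b) <= tolerance

Even the tolerant version only works if the chosen window actually reflects the skew and delays of the systems involved; the point is that the window has to be chosen, not wished away.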

Performance

Assuming you can predict anything about the real-time characteristics of a piece of software without extensive analysis is the ultimate form of self-delusion, and yet it is one that people regularly seem to engage in. Why, oh why!? Given the amount of instruction re-ordering, instruction-level parallelism and caching that goes on inside a modern CPU, it's worth either understanding exactly what is going on, or treating it as completely non-deterministic. The situation gets more complex when you start talking about the time involved in access to memory or (god forbid!) hard drives.
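
In the same spirit, here is a small sketch of "measure, don't guess", using Python's standard timeit module (the work function is just a placeholder):

import timeit

def work():
    # Placeholder for whatever you believe you can predict the cost of.
    return sum(i * i for i in range(10000))

# Run it many times and look at the spread, not a single number: caching,
# scheduling and frequency scaling all make one-off timings unreliable.
samples = timeit.repeat(work, number=100, repeat=5)
print(min(samples), max(samples))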

Concurrency

With the advent of commodity multicore hardware, there seems to be a tendency amongst some programmers to think about concurrency in terms of magical hardware, as though that obviated the need to think about all of the extra possibilities for race conditions it introduces. The hardware is not magical and doesn't do anybody any favours. Only sensible up-front design, combined with an approach that deals with the unpredictability of hardware performance in concurrent situations, will yield the kinds of successes we hope for.
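
As a tiny Python illustration (names mine), here is the classic read-modify-write race that "the hardware will take care of it" thinking ignores, along with the boring, explicit fix:

import threading

counter = 0
lock = threading.Lock()

def unsafe_increment(n):
    global counter
    for _ in range(n):
        counter += 1          # read-modify-write: threads can interleave here

def safe_increment(n):
    global counter
    for _ in range(n):
        with lock:            # explicit synchronisation, not magical hardware
            counter += 1

# Running several threads over unsafe_increment may lose updates;
# with safe_increment the final count is always the sum of the increments.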

Let's banish the magical thinking about hardware from our discipline, huh?

Friday 26 February 2010

What are the semantics of your favourite programming language?

"Focusing purely on syntax is like reading Shakespeare and then commenting on the excellent quality of the spelling" -- David Streader

I came across a recent post about the importance of semantics on the Abstract Heresies blog. While very short, it summed up well my general feelings about the attitudes of programmers to things like semantics. I get the feeling that for many, it goes something like this:
  • Learn the syntax of one language. Do some basic coding. High rate of errors.
  • Develop intuitions around semantics of that language. Rate of errors decreases.
  • Learn the syntax of another language. Do some basic coding. Rate of errors slightly less than after learning the syntax of the first language.
  • Develop intuitions around semantics of new language, relating these intuitions to the intuitions about the semantics of the first language. Rate of errors decreases.

And so on.

I should take this opportunity to define what I mean by "semantics" (please excuse the self-referential nature of the last statement and this one). In the world of languages (formal or informal), there is a distinction between the concrete representation of something and what it actually means. In programming languages, this is essentially the difference between the program text that you feed into a compiler or interpreter, and the behaviour of the program once it is run.

The typical approaches to defining semantics fall into two general categories: operational semantics, where one defines a kind of "abstract machine", specifying how its state is modified upon encountering some construct, and denotational semantics, where constructs in the language are mapped onto other structures (usually mathematical ones) that give them meaning. In human language, the syntax/semantics distinction is the difference between the sounds a person makes while speaking ("syntax") and the understanding that one takes away from those words ("semantics").
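
For a flavour of what an operational approach looks like in practice, here is a toy sketch in Python (my own illustrative encoding, not any standard formalism): a tiny expression language and an evaluator that plays the role of the "abstract machine", making the meaning of each construct explicit.

# A toy expression language: numbers, addition and division, represented as
# nested tuples. The evaluator is an informal big-step operational semantics:
# it states exactly what each construct means, independently of its syntax.

def evaluate(expr):
    kind = expr[0]
    if kind == "num":                                   # ("num", 3)
        return expr[1]
    if kind == "add":                                   # ("add", e1, e2)
        return evaluate(expr[1]) + evaluate(expr[2])
    if kind == "div":                                   # ("div", e1, e2)
        # The semantic choice lives here: truncate? round? error on zero?
        return evaluate(expr[1]) / evaluate(expr[2])
    raise ValueError("unknown construct: %r" % (kind,))

print(evaluate(("add", ("num", 1), ("div", ("num", 1), ("num", 4)))))   # 1.25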

Fundamentally, most errors in programs come from a disconnect between a programmer's understandings of the semantics of some system, and the actual semantics of that system. This could mean misunderstanding the effect of some statement in a programming language, or misunderstanding the effect of some statement in a specification (formal or informal).

Consequently, I really think that it is important to emphasise the semantics of programming languages. It seems like there is too much emphasis placed on syntax. Syntax is largely uninteresting - it differs in only trivial ways, even between languages that are radically different.

If I write down this definition:

function mystery 0 = 0
| mystery x = 1 / x

We can figure out what this function does based on our understanding of function application and of division. That's not "real" syntax for any particular language, and if we were to concretise it to one, it would probably look similar in most languages. Yet we would very quickly begin to encounter semantic distinctions: some languages will round the result, some will truncate it; some will keep arbitrary precision, others will use floats or doubles. I would bet that with 5 different languages picked at random, you could get 5 different results.
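
Python alone is enough to see the distinction. Here is a purely illustrative transcription of mystery: it behaves differently depending on which division operator you reach for, and neither choice is visible in the "syntax" above.

def mystery_true(x):
    # '/' is true division: mystery_true(2) == 0.5
    return 0 if x == 0 else 1 / x

def mystery_floor(x):
    # '//' is floor division: mystery_floor(2) == 0
    return 0 if x == 0 else 1 // x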

These are relatively minor distinctions. How about this one?

function rec x = rec (x + 1)

It's obviously non-terminating, but what exactly does non-termination mean? Some languages will overflow the stack or exhaust some kind of memory limit within a finite number of applications of the function. Some languages may overflow their underlying numerical representation. Some might keep computing until they exhaust main memory. And some languages will never compute this function at all (because the effect is unobservable, so it'll be optimised away).
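
A rough Python rendering makes the point: the "same" definition exhausts a different resource depending on how the language runs it.

import sys

def rec(x):
    return rec(x + 1)    # no base case

# CPython does not eliminate tail calls, so this raises RecursionError after
# roughly sys.getrecursionlimit() frames instead of looping forever.
try:
    rec(0)
except RecursionError:
    print("stack exhausted after ~%d frames" % sys.getrecursionlimit())

# A language with tail-call elimination would loop indefinitely instead, and
# one with fixed-width integers would eventually overflow x.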

The effect starts getting even more pronounced when you start talking about objects and concurrency. Are your messages blocking or non-blocking? Is transmission guaranteed? How about ordering? Are there implied synchronisation points or locks? What does it mean for an object in one thread to hold a reference to an object in another thread, and call methods on that object?
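
Even within one language the answers are not obvious from the shape of the code. A small sketch with Python's standard queue module: whether a receive blocks is a semantic decision you have to look up, not something the syntax tells you.

import queue

mailbox = queue.Queue()
mailbox.put("hello")                      # blocks only if the queue is bounded and full

msg = mailbox.get()                       # blocking receive: waits for a message
try:
    msg2 = mailbox.get(block=False)       # non-blocking receive: raises if empty
except queue.Empty:
    msg2 = None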

While I'm an advocate of formal semantics, I'm a realist. I understand that that would not necessarily enrich the lives of all programmers. But c'mon guys, how about even an informal specification of the semantics of some of these languages that are in common use? Programmer experimentation gets you some of the way, but nothing really beats a rigorous treatment to shine a light in the dark corners of languages where errors hide.

Tuesday 23 February 2010

The state of the world (and why I still think it's a bad idea)

I've been working recently on context-aware systems - i.e. systems that respond to some measurable things in the user (or computing) environment to infer a context, exhibiting some context-sensitive behaviour as a result. There are lots of things that can define a context, and many ordinary systems can probably be characterised as context-aware, particularly in the mobile computing space.

So, one obvious measured value from the environment is location, which gives rise to the most ubiquitous class of context-aware systems: location based services. In its simplest form, this might mean using a GPS built into a mobile phone or PDA to infer the location of the user, guiding him or her to relevant local points. For example, a user might want to see all the restaurants within walking distance - fire up the GPS, query the database of restaurant locations - done! So far, so pedestrian. There are plenty of services that have done this (and that have done it well).

The complication comes when we wish to integrate multiple sources of information to infer a single abstract notion of a context, or when we wish to integrate discrete information. At the IT University of Copenhagen, we have a system of BLIP (Bluetooth) nodes installed in the corners of the building on every floor. Coverage is not uniform or total, so a device (and therefore a user) may be seen in one corner of the second floor, disappear from Bluetooth coverage, and moments later reappear on the fourth floor. It is therefore necessary to abstract away from the real measured environment towards some more general notion of location or space. Adding more sensors, measuring different values at different intervals, simply compounds the problem: the disparity between the real environment and the inferred context grows wider and wider.

This conceptual gap is where my problem with notions of global state comes in. Building a system that represents state explicitly (and globally) is disingenuous in most cases. In the object-oriented world, for example, one might be tempted to have a "User" object that records information about the locations of users. Except that the object is persistent, and the user's presence really isn't. We're now in the world of confidence intervals and uncertainty, yet if I query the object associated with some user and ask "where are you now?", the response will likely be a firm, reassuring position in some coordinate space, or a reference to a node.

The problem exists because we've built an abstraction that is not a true abstraction of the underlying measured environment. If we base some behaviour on the response from the User object, we're likely to be acting on information that is well and truly out of date. The sensors actually don't necessarily know where a user is at any given moment, only that the user was seen in a location at some time. If we shift our abstraction to work on this basis instead (perhaps the User object returns a "last seen" timestamp and a location), then what have we gained from this abstraction? We can start to build a whole host of largely equivalent abstractions, none of which are particularly helpful, because they all rely on a notion of having total knowledge of the state of the world at all times. The kinds of stateful representations provided by most mainstream programming languages are, I argue, poor models of many measurable environments. Without building higher-order abstractions over the abstractions inherent in the language, we are forced to either build useless abstractions, or hide too much necessary detail.
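
To make the contrast concrete, here is a minimal Python sketch (all names hypothetical) of the two abstractions: a "current position" field that silently goes stale, versus an explicit observation that carries its own timestamp and forces the caller to decide how much staleness to accept.

import time

class UserStateful:
    """Pretends to know where the user is right now."""
    def __init__(self):
        self.position = None              # goes stale the moment it is set

    def where_are_you(self):
        return self.position              # a firm answer, however old

class UserObserved:
    """Only records what the sensors actually reported."""
    def __init__(self):
        self.last_seen = None             # (position, timestamp) or None

    def record_sighting(self, position):
        self.last_seen = (position, time.time())

    def last_sighting(self, max_age_seconds):
        # The caller must say how old an observation they are willing to trust.
        if self.last_seen is None:
            return None
        position, ts = self.last_seen
        return position if time.time() - ts <= max_age_seconds else None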

If you agree with this premise, then you may wonder what the solution is. In short, it's reactiveness. To interact with a measured environment that produces discrete values, programs must not rely on stored state information about the environment; rather, new facts must be computed as the measurements are made. When something changes in the real environment, programs must react to that change and emit new events to be consumed by other programs (or other parts of the same program). In this way, idioms such as Functional Reactive Programming seem well suited to the task. Even outside the world of context-aware computing, persistent global state is often smoke and mirrors hiding potential data-currency issues.
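
A very rough sketch of the reactive style in Python (callback-based, names mine, not any particular FRP library): measurements are pushed to subscribers as they arrive, and downstream facts are recomputed from each event rather than read out of a stored "current state".

class EventStream:
    """Minimal push-based event stream; a stand-in for a real FRP abstraction."""
    def __init__(self):
        self._subscribers = []

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def emit(self, event):
        for callback in self._subscribers:
            callback(event)

sightings = EventStream()

# Derived behaviour is recomputed per measurement, not read from global state.
sightings.subscribe(lambda e: print("user %s seen at node %s" % (e["user"], e["node"])))

sightings.emit({"user": "alice", "node": "2F-corner"})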

So the question I ask is: do you really need it?

Monday 7 December 2009

By Request: My thoughts on Perl 6

Good idea: Redesigning a clearly broken language from scratch

Bad idea:

multi unique (*$x, *@xs) { member($x, |@xs) ?? unique(|@xs) !! ($x, unique(|@xs)) }

Monday 5 October 2009

The bug-free myth: Why we trust verified software

Any time there is a discussion of formal proofs of correctness in software, I hear the same arguments made. These arguments are almost invariably made by people who have never done any formal verification, and who don't necessarily have a particularly good knowledge of how formal verification even works. This post will attempt to provide both a few counter-arguments against the usual objections, and an overview of the theoretical basis upon which formal verification of software depends.

The Bug-free myth

All too often, I hear the phrase "free of bugs" used in conjunction with theorem proving or formal verification of software. I want to make perfectly clear, this is not what theorem proving claims. You'll often hear phrases like "guaranteed correct". This is an (informal) shorthand for "guaranteed correct with respect to some specification". Allow me to make the distinction clear with an example.

Here is a program:

x = 0;
while(1) {
    x = x + 1;
}

This program isn't very useful. It is not, however, incorrect (nor is it "correct"), without reference to some specification. If our specification is simply "keep incrementing a variable until the universe ends", this program is correct. If we were expecting this program to calculate something useful, or produce some output, then it is obviously incorrect. The "correctness" judgement always follows from a specification, and no judgement can be made in the absence of a specification.

This definition covers the (informal) usage of "correctness", but this is still distinct from "bug-free". The main activity within formal verification of software is generally the verification of specific safety and liveness properties.

"Safety" is approximately the guarantee that some bad thing can never happen. It might be that a certain storage location can never overflow, or that a value is always positive, or that a pointer is not null, or even much more concrete functional things, like "the autopilot will not switch off the engines while the plane is in flight". Keep in mind that doing nothing is always "safe". It's impossible for a storage location to overflow if the program never does anything. A program that simply halts immediately will fulfil almost any safety property you care to dream up, so we can re-state the definition as "if anything happens, it will be something desirable".

This is where liveness comes in. This is approximately the guarantee that "some desirable thing will happen (eventually)". These might be properties like "the doors will open", or "the value stored in X will be incremented".

When people talk about software being verified, they really mean that a set of safety and liveness properties have been shown to be true of the software in question. When undertaking a verification activity, we choose a set of safety and liveness properties that are of interest or importance, and we prove those. A property like "there are no bugs" is completely useless (and unprovable), however, for some safety property X, we can at least say "there are no bugs that would cause property X to become false".
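
By way of analogy only: a safety property is the kind of thing you might check at runtime with an assertion, as in the Python sketch below (names mine), except that a verification effort proves the assertion can never fire for any input, rather than merely observing that it hasn't fired yet.

CAPACITY = 255   # pretend this is the capacity of some storage location

def store(buffer, value):
    # Safety property: the stored value never exceeds the capacity.
    # A test checks this for the inputs you happened to run; a proof covers all inputs.
    assert 0 <= value <= CAPACITY, "safety property violated"
    buffer.append(value)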

One of the main ways we verify these safety and liveness properties is through the use of automated or interactive theorem provers. These are pieces of software that permit the construction of mathematical proofs, by both checking the proofs, and by providing some assistance to a human in searching for a suitable proof that some property holds. Unfortunately, as soon as you mention theorem provers and all the wonderful things they do, this usually leads to a new objection.

But what if the theorem prover has bugs in it?

OK, it seems a natural enough objection, but you don't think we're completely dumb, do you? We are in the business of proving software correct, so you can assume we have some reason to believe in the software we use.

But actually, correctness within theorem provers is less important than you might think. I said earlier that a theorem prover has two basic functions - it acts as a proof checker, and it provides tools to aid in coming up with a suitable proof of a property.

The important part of this is that only the proof checker needs to be correct. The other parts of the theorem prover could just generate random garbage proofs, and they would be rejected as incorrect proofs of the property under consideration by the proof checker. In practice, the techniques used in proof search have some very good heuristics about the chance that they will succeed in finding a valid proof, and the time it will take for them to find such a proof, but these are just for the benefit of their users. These proof-search aids do not simply return "true" if they find a valid proof - they instead must return the proof itself, which the proof-checker can then verify.
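
The division of labour is easy to caricature in Python. The toy "logic" below is entirely mine (atoms and implications, with modus ponens as the only inference rule) and is nothing like Isabelle or Coq, but the architecture is the same: check_proof is the small trusted part, and nothing the search produces is believed until it gets past the checker.

# Formulas: an atom is a string; ("A", "B") stands for the implication A -> B.

def check_proof(axioms, goal, proof):
    """Trusted core: every line must be an axiom or follow by modus ponens."""
    known = set(axioms)
    for line in proof:
        follows = line in known or any((premise, line) in known for premise in known)
        if not follows:
            return False
        known.add(line)
    return goal in known

def search_for_proof(axioms, goal):
    """Untrusted search: blindly applies modus ponens until nothing new appears.
    Even if this were buggy, check_proof would reject any bad output."""
    known, proof = set(axioms), []
    changed = True
    while changed:
        changed = False
        for fact in list(known):
            if isinstance(fact, tuple) and fact[0] in known and fact[1] not in known:
                known.add(fact[1])
                proof.append(fact[1])
                changed = True
    return proof if goal in known else None

axioms = ["A", ("A", "B"), ("B", "C")]         # A, A -> B, B -> C
proof = search_for_proof(axioms, "C")
print(proof, check_proof(axioms, "C", proof))  # ['B', 'C'] True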

So, hopefully you have been suitably convinced that most of the theorem prover does not need to be correct.

So how do you know that the proof checker is correct?

I'm glad you asked. This is going to require a quick whirlwind tour of how mathematical logic works.

Theorem provers like Isabelle and Coq have a wonderfully rich set of constructs built into them - numbers, arithmetic, records, functions - and the list goes on. The important fact about all of these things is that they can be constructed from a very small set of much simpler constructs.

There is a class of logics called Hilbert systems that are built from a very small number of axioms and inference rules, and which permit you to build up most of the familiar constructs of logic and mathematics in a systematic way. Similarly, Zermelo–Fraenkel set theory allows one to build up almost all of mathematics in a systematic way. This is old magic, and it is pretty well understood.

Because I'm writing this for a general audience, I won't dig into the specifics of the logical foundations used within particular theorem provers. The general idea is this: there have been many attempts to formalise all of mathematics from a very small set of core constructs, and several of them have been successful enough to give us a formalised foundation for the kinds of mathematics most of us care about. If you have grasped that, you already know how a theorem prover works.

So once we've encoded this very small core manually, we can build all the rest of our theorem-prover theories on top of it, and we can mechanically check that we are doing so correctly, either by hand or by using automated proof-checking techniques.

The small core inside the theorem prover is all that is really important now, in terms of correctness. This can be verified by hand, and it can also be verified using other theorem provers.

But even this doesn't really need to be correct. When we use a theorem prover to aid us in proving a property, we end up with a proof script - some machine-readable version of the proof. This doesn't necessarily rely upon the software that was used to help create it (although it might get messy trying to decouple it). This proof can be verified using any other piece of software, or it can be verified by hand without any creativity - you just have to mechanically apply each step in the proof and satisfy yourself that each step follows by a valid step of inference from the last.

The final important property of theorem-proving software is that we can always look for consistency. If we get anything wrong anywhere along the way, proofs are going to start failing. It is already true that there are some true statements that cannot be proved within the logic in question, but this isn't really a problem as far as the correctness of the theorem prover is concerned. We would only be concerned if there were false statements that could somehow be shown to be true inside our logical system - and this is not the case! A purported proof of a contradiction must contain an invalid inference step somewhere, and mechanical proof checking will reject it.

But what if the proof has bugs in it?

Hopefully you are now satisfied that we can have a very high degree of confidence in the results we get out of a theorem prover. A proof is not like a "program" in the sense that we could have valid steps of reasoning and still just "get the wrong answer". The only place where real "bugs" can be introduced is on the human side of things, where we might state safety or liveness properties that do not actually capture the properties we are interested in.

There are several strategies that we can use here. The first is syntax and type checking on the specification itself. Given that we are expressing our safety properties inside a theorem prover using something that is essentially a programming language, we can use the same techniques that have been shown to be very useful in enforcing consistency and detecting errors in traditional programming languages. This should remove the possibility that a simple typo could be totally catastrophic.

The second way is enforcing a discipline where we re-state the same properties in different ways. By doing this, we ensure that either a proof of one of the properties will fail, or we will introduce inconsistency between our encodings of the properties which will be rejected by the theorem prover.

Finally, and although I've previously argued against it, be careful. This is very different to the suggestion that you just "be careful" when constructing a program in a traditional language: here, the care is confined to a small set of stated properties. You can only guarantee that your program is correct with respect to the specification that you have written down, and falsely claiming confidence in your software where none exists would be just as catastrophic in any environment. I really think it's worth saying that we are way off into the land of the abstract here. Even in a complex specification, it would be very difficult to encode things in such a way that you succeed in proving a whole set of properties, none of which provide the guarantees you think they do (the proofs are likely to fail if there is a mismatch between what you think you are proving and what you are actually proving). This is a much stronger safeguard against human error than is provided by any traditional programming environment available today.

But what if the compiler is incorrect?

There are two basic answers to this. In proving properties about software, you are either working with an idealised model of the software encoded into the theorem prover, assuming some particular semantics for the target language and/or underlying hardware, or you are actually writing the software inside the theorem prover and then generating code once you have proven all the properties of interest. In either case, you are somewhat at the mercy of the compiler. This is always the case though - any correct program can be made incorrect by faults in the compiler or the hardware. You can take two approaches here: assume that the hardware and compiler work as expected (accepting that your guarantees of correctness are invalidated if they do not), or make use of recent work on certified compilation to at least remove the uncertainty of the compiler from the equation. There is a whole separate body of work on (hardware) fault tolerance in software; if you really need this level of safety, then you probably need to be an expert on these kinds of issues anyway. Theorem provers for software verification are concerned with the verification of certain properties. If validating your assumptions about the hardware and software environments is important, then you will need to do more work.

Is it worth it?

For some things, yes. This kind of formal software verification is hard, labour-intensive work. That said, it is getting easier as the tools get better, and the barrier to entry is getting lower. I don't think it will be very long before we start seeing a lot more lightweight theorem proving taking place as a standard part of compilation. In a world where software failure costs huge amounts of money, it doesn't seem so remote a possibility that applying formal verification to more software will become practical and cost-efficient.

--

Personal plea: I'm still looking for reasonably-priced accommodation for my partner and me in Copenhagen if anyone has any leads. I'm eager to go start my PhD at ITU, but need to find somewhere to live first.