<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-4376947664270917518</id><updated>2011-11-27T23:57:33.452Z</updated><title type='text'>Adventures in Programming Languages and Semantics</title><subtitle type='html'>I write lots of words about lots of things.  Some of them are factual, some of them are pure opinion pieces.</subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://plsadventures.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4376947664270917518/posts/default?max-results=100'/><link rel='alternate' type='text/html' href='http://plsadventures.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><author><name>Gian</name><uri>http://www.blogger.com/profile/04384217245734242968</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='26' src='http://1.bp.blogspot.com/_AhA71miQEj0/SoqxZcE7mlI/AAAAAAAAAAM/SCAy8ISor6w/S220/zoinks.jpg'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>24</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-4376947664270917518.post-5929740091838323750</id><published>2011-04-13T15:55:00.010+01:00</published><updated>2011-04-13T17:08:22.601+01:00</updated><title type='text'>NZ Copyright Amendment Letter</title><content type='html'>I don't usually post non-PL-related things here, but there is some &lt;a href="http://www.stuff.co.nz/technology/digital-living/4882838/Law-to-fight-internet-piracy-rushed-through"&gt;legislation brewing in New Zealand&lt;/a&gt; that is deeply unsettling to me, and so I'm publishing here the letter I wrote to my local Member of Parliament for the area which I cast my vote in:&lt;br /&gt;&lt;br /&gt;&lt;p&gt;&lt;br /&gt;Dear Mr Bennett,&lt;/p&gt;&lt;p&gt;I am writing to you as a constituent to express my deep concern for both the substance of the Copyright (Infringing File Sharing) Amendment currently before the house, and for the procedure by which this legislation has reached its current status.&lt;br /&gt;&lt;br /&gt;The use of Parliamentary urgency under the auspices of assisting with the aftermath of the Canterbury earthquake to advance this controversial and flawed legislation is deeply disturbing to me.  It reflects a perversion of the democratic process in which due time and consideration is given to the will of the people.  This attempt to rush legislation with no particular time pressure attached to it through the house reflects poorly upon the dignity and legitimacy of our democracy.&lt;br /&gt;&lt;br /&gt;Furthermore, the tone and substance of the debate in the house suggested that the majority of the MPs are not well-informed about the complex technical, legal and social implications of this legislation, particularly the "disconnection clause" that gives an extra-judicial remedy to intellectual property owners who are not required to submit to the usual burden of proof as in any other civil action.  Given the fundamental nature of the internet within New Zealand business, education and personal life, this remedy that may be applied in the presence of a mere accusation seems disproportionate and set to disadvantage both the businesses that provide internet services within New Zealand, and the consumers of internet services who rely on them for so many facets of daily life.&lt;br /&gt;&lt;br /&gt;As a doctoral student in Computer Science, the many technical and social issues that remain unaddressed by this legislation are perhaps more worrying to me than to those members of Parliament currently advancing this legislation.  I (and many others within the internet community in New Zealand) feel that this legislation, while addressing a perceived problem, has not been adequately discussed and refined to the point where it could be effective in addressing the purported issues.  This is another reason why attempting to effect this legislation so quickly is inappropriate.&lt;br /&gt;&lt;br /&gt;I hope that you will take my genuine concerns in the spirit that they are given, and that you can be persuaded to oppose both the process and substance of this legislation until such a time as the full implications can be considered and discussed by the public at large, and effective, fair legislation can be drafted that will encourage the development of healthy creative industries and a healthy internet industry within New Zealand.&lt;br /&gt;&lt;br /&gt;Yours sincerely,&lt;br /&gt;&lt;br /&gt;Gian Perrone, B.Sc (Hons), M.Sc (Hons)&lt;br /&gt;&lt;/p&gt;&lt;p&gt;As an aside, be warned that if this legislation passes, anyone reading this letter on this blog is probably guilty of &lt;b&gt;*gasp*&lt;/b&gt; &lt;i&gt;file sharing&lt;/i&gt;!&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4376947664270917518-5929740091838323750?l=plsadventures.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://plsadventures.blogspot.com/feeds/5929740091838323750/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://plsadventures.blogspot.com/2011/04/i-dont-usually-post-non-pl-related.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4376947664270917518/posts/default/5929740091838323750'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4376947664270917518/posts/default/5929740091838323750'/><link rel='alternate' type='text/html' href='http://plsadventures.blogspot.com/2011/04/i-dont-usually-post-non-pl-related.html' title='NZ Copyright Amendment Letter'/><author><name>Gian</name><uri>http://www.blogger.com/profile/04384217245734242968</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='26' src='http://1.bp.blogspot.com/_AhA71miQEj0/SoqxZcE7mlI/AAAAAAAAAAM/SCAy8ISor6w/S220/zoinks.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4376947664270917518.post-1249032103868731274</id><published>2010-08-01T17:22:00.003+01:00</published><updated>2010-08-01T17:40:34.726+01:00</updated><title type='text'>"It's a hardware problem" is a programmer problem</title><content type='html'>I've heard it said too often by programmers "it's a hardware problem", usually with a bit of back-slapping and joviality, as thought the catch all "hardware problem" descriptor frees the programmer from any responsibility to investigate further.  While the spirit of such comments is usually tongue-in-cheek, I can't help but feel it's symptomatic of the view taken of the machinery of computer by too many programmers:  that it is somehow magical.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Programmers make a lot of assumptions about hardware, and modern operating systems allow (and encourage!) this.  I'm definitely not suggesting that programmers should constantly be shifting to a lower level of abstraction (quite the opposite, I'm all for more abstractions!), but this has to be done with a mind to the fact that abstractions are exactly that - they are &lt;i&gt;abstractions of concrete systems&lt;/i&gt;.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Areas where I see the most mistakes, in no particular order:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;ul&gt;&lt;li&gt;Time&lt;/li&gt;&lt;li&gt;Performance&lt;/li&gt;&lt;li&gt;Concurrency&lt;/li&gt;&lt;/ul&gt;&lt;div&gt;Let's examine them briefly.&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-size: large;"&gt;&lt;b&gt;Time&lt;/b&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Oh god.  Why does time seem to turn programmers silly?  There is a well-known phenomenon in physics: &lt;i&gt;No two clocks are ever the same&lt;/i&gt;.  Computers obey the same property.  Even at relatively large time scales, clocks on two different machines will always be different.  Trying to line up time stamps exactly is never, ever going to work.  This is assuming you have already managed to account for all the delays in timing caused by software overheads and scheduling delays.  Even the hardware clocks are not the same.  So please, don't assume that the hardware supplying your calls to system clocks is somehow magical.  It's really not.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Performance&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Assuming you can predict anything about the real-time characteristics of a piece of software without extensive analysis is  the ultimate form of self-delusion, and yet it is one that people regularly seem to engage in.  Why, oh why!?  Given the amount of instruction re-ordering, instruction-level parallelism and caching that goes on inside a modern CPU, it's worth either understanding exactly what is going on, or treating it as completely non-deterministic.  The situation gets more complex when you start talking about the time involved in access to memory or (god forbid!) hard drives.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Concurrency&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;With the advent of modern commodity multicore hardware, there seems to be a tendency amongst some programmers to think about concurrency in terms of magical hardware, obviating the need to think about all of the extra possibilities for race conditions that this introduces.  The hardware is not magical and doesn't do anybody any favours.  Only sensible up-front design mixed with an approach that deals with the unpredictability of hardware performance in concurrent situations will yield the kinds of successes we hope for.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Let's banish the magical thinking about hardware from our discipline, huh?&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4376947664270917518-1249032103868731274?l=plsadventures.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://plsadventures.blogspot.com/feeds/1249032103868731274/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://plsadventures.blogspot.com/2010/08/its-hardware-problem-is-programmer.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4376947664270917518/posts/default/1249032103868731274'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4376947664270917518/posts/default/1249032103868731274'/><link rel='alternate' type='text/html' href='http://plsadventures.blogspot.com/2010/08/its-hardware-problem-is-programmer.html' title='&quot;It&apos;s a hardware problem&quot; is a programmer problem'/><author><name>Gian</name><uri>http://www.blogger.com/profile/04384217245734242968</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='26' src='http://1.bp.blogspot.com/_AhA71miQEj0/SoqxZcE7mlI/AAAAAAAAAAM/SCAy8ISor6w/S220/zoinks.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4376947664270917518.post-7967274315501588896</id><published>2010-02-26T13:04:00.004Z</published><updated>2010-02-26T13:43:36.536Z</updated><title type='text'>What are the semantics of your favourite programming language?</title><content type='html'>&lt;div&gt;&lt;i&gt;"Focusing purely on syntax is like reading Shakespeare and then commenting on the excellent quality of the spelling" -- David Streader&lt;/i&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;I came across a&lt;a href="http://funcall.blogspot.com/2010/02/long-series-of-posts.html"&gt; recent post about the importance of semantics&lt;/a&gt; on the &lt;a href="http://funcall.blogspot.com/"&gt;Abstract Heresies&lt;/a&gt; blog.  While very short, it summed up well my general feelings about the attitudes of programmers to things like semantics.  I get the feeling that for many, it goes something like this:&lt;div&gt;&lt;ul&gt;&lt;li&gt;Learn the syntax of one language.  Do some basic coding.  High rate of errors.&lt;/li&gt;&lt;li&gt;Develop intuitions around semantics of that language.  Rate of errors decreases.&lt;/li&gt;&lt;li&gt;Learn the syntax of another language.  Do some basic coding.  Rate of errors slightly less than after learning the syntax of the first language.&lt;/li&gt;&lt;li&gt;Develop intuitions around semantics of new language, relating these intuitions to the intuitions about the semantics of the first language.  Rate of errors decreases.&lt;/li&gt;&lt;/ul&gt;&lt;div&gt;And so on. I should take this opportunity to define what I mean by "semantics" (please excuse the self-referential nature of the last statement and this one).  In the world of languages (formal or informal), there exists a distinction between the concrete representation of something, and what it actually &lt;i&gt;means&lt;/i&gt;.  In programming languages, this is essentially the difference between the program text that you feed into a compiler or interpreter, and the behaviour of the program once it is run.  The typical approaches to defining semantics fall into two general categories - &lt;i&gt;operational semantics&lt;/i&gt;, where one defines a kind of "abstract machine", specifying how its state is modified upon encountering some construct, and &lt;i&gt;denotational semantics&lt;/i&gt;, where constructs in the language are mapped onto other structures that give them meaning (these are usually various kinds of mathematical structures).  In human language, this syntax/semantics distinction could be demonstrated in the difference between the sounds that a person makes while speaking ("syntax"), and the understanding that one takes away from those words ("semantics").&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Fundamentally, most errors in programs come from a disconnect between a programmer's understandings of the semantics of some system, and the actual semantics of that system.  This could mean misunderstanding the effect of some statement in a programming language, or misunderstanding the effect of some statement in a specification (formal or informal).&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Consequently, I really think that it is important to emphasise the semantics of programming languages.  It seems like there is too much emphasis placed on syntax.  Syntax is largely uninteresting - it differs in only trivial ways, even between languages that are radically different.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;If I write down this definition:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;function mystery 0 = 0&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;       | mystery x = 1 / x&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;We can figure out what this function does based on our understanding of function application, and of division.  That's not "real" syntax for any particular language.  However, if we were to concretise it to a language, it would probably look similar in most languages, however we would very quickly begin to encounter semantic distinctions.  Some languages will round the result, some will truncate it.  Some will keep infinite precision, others will use floats or doubles.  I would bet that with 5 different languages picked at random, you could get 5 different results.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;These are relatively minor distinctions.  How about this one?&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;function rec x = rec (x + 1)&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;It's obviously non-terminating, but what exactly does non-termination mean?  Some languages will overflow the stack or exhaust some kind of memory limit within a finite number of applications of the function.  Some languages may overflow their underlying numerical representation.  Some might keep computing until they exhaust main memory.  And some languages will never compute this function at all (because the effect is unobservable, so it'll be optimised away).&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The effect starts getting even more pronounced when you start talking about objects and concurrency.  Are your messages blocking or non-blocking?  Is transmission guaranteed?  How about ordering?  Are there implied synchronisation points or locks?  What does it mean for an object in one thread to hold a reference to an object in another thread, and call methods on that object?&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;While I'm an advocate of formal semantics, I'm a realist.  I understand that that would not necessarily enrich the lives of all programmers.  But c'mon guys, how about even an &lt;i&gt;informal&lt;/i&gt; specification of the semantics of some of these languages that are in common use?  Programmer experimentation gets you some of the way, but nothing really beats a rigorous treatment to shine a light in the dark corners of languages where errors hide.&lt;/div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4376947664270917518-7967274315501588896?l=plsadventures.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://plsadventures.blogspot.com/feeds/7967274315501588896/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://plsadventures.blogspot.com/2010/02/what-are-semantics-of-your-favourite.html#comment-form' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4376947664270917518/posts/default/7967274315501588896'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4376947664270917518/posts/default/7967274315501588896'/><link rel='alternate' type='text/html' href='http://plsadventures.blogspot.com/2010/02/what-are-semantics-of-your-favourite.html' title='What are the semantics of your favourite programming language?'/><author><name>Gian</name><uri>http://www.blogger.com/profile/04384217245734242968</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='26' src='http://1.bp.blogspot.com/_AhA71miQEj0/SoqxZcE7mlI/AAAAAAAAAAM/SCAy8ISor6w/S220/zoinks.jpg'/></author><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4376947664270917518.post-3641717446643691192</id><published>2010-02-23T14:30:00.003Z</published><updated>2010-02-23T15:00:15.933Z</updated><title type='text'>The state of the world (and why I still think it's a bad idea)</title><content type='html'>I've been working recently on context-aware systems - i.e. systems that respond to some measurable things in the user (or computing) environment to infer a &lt;i&gt;context&lt;/i&gt;, exhibiting some context-sensitive behaviour as a result.  There are lots of things that can define a context, and many ordinary systems can probably be characterised as context-aware, particularly in the mobile computing space.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;So, one obvious measured value from the environment is &lt;i&gt;location&lt;/i&gt;, which gives rise to the most ubiquitous class of context-aware systems: &lt;i&gt;location based services&lt;/i&gt;.  In its simplest form, this might mean using a GPS built into a mobile phone or PDA to infer the location of the user, guiding him or her to relevant local points.  For example, a user might want to see all the restaurants within walking distance - fire up the GPS, query the database of restaurant locations - done!  So far, so pedestrian.  There are plenty of services that have done this (and that have done it well).&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The complication comes as we wish to integrate multiple sources of information to infer a single abstract notion of a context, or when we wish to integrate discrete information.  At the &lt;a href="http://www.itu.dk/"&gt;IT University of Copenhagen&lt;/a&gt;, we have a system of BLIP (Bluetooth) nodes installed in the corners of the building on every floor.  Coverage is not uniform or total, so a device (and therefore a user) may be seen in one corner of the second floor, disappear from the Bluetooth coverage, and moments later reappear on the 4th floor.  It is therefore necessary to begin to abstract away from the real measured environment some more general notion of location or space.  Adding more sensors measuring different values with different intervals simply compounds the problem further.  The disparity between the real environment and the inferred context grows wider and wider.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;This conceptual gap is where my problem with notions of global state come in.  Building a system that represents state explicitly (and globally) is disingenuous in most cases.  In the object oriented world, for example, one might be tempted to have a "User" object that records information about the locations of users.  Except, that object is persistent, and the user really isn't.  We're now in the world of confidence intervals and uncertainty.  If I query the object associated with some user and ask "where are you now?" - the response will likely be a firm and reassured position in some coordinate space, or a reference to a node.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The problem exists because we've built an abstraction that is not a true abstraction of the underlying measured environment.  If we base some behaviour on the response from the User object, we're likely to be acting on information that is well and truly out of date.  The sensors actually don't necessarily know where a user is at any given moment, only that the user was seen in a location at some time.  If we shift our abstraction to work on this basis instead (perhaps the User object returns a "last seen" timestamp and a location), then what have we gained from this abstraction?  We can start to build a whole host of largely equivalent abstractions, none of which are particularly helpful, because they all rely on a notion of having total knowledge of the state of the world at all times.  The kinds of stateful representations provided by most mainstream programming languages are, I argue, poor models of many measurable environments.  Without building higher-order abstractions over the abstractions inherent in the language, we are forced to either build useless abstractions, or hide too much necessary detail.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;If you agree with this premise, then you may wonder what the solution is.  In short, it's reactiveness.  In order to interact with the measured environment producing discrete values, programs must not rely on state information about the environment, rather, new facts must be computed as the measurements are made.  When something changes in the real environment, programs must react to this change, and emit new events to be consumed by other programs (or parts of the program).  In this way, idioms such as &lt;a href="http://en.wikipedia.org/wiki/Functional_reactive_programming"&gt;Functional Reactive Programming&lt;/a&gt; seem well suited to the task.  Even outside the world of context-aware computing, it seems that persistent global state is often smoke and mirrors hiding potential data currency issues.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;So the question I ask is:  do you really need it?&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4376947664270917518-3641717446643691192?l=plsadventures.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://plsadventures.blogspot.com/feeds/3641717446643691192/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://plsadventures.blogspot.com/2010/02/state-of-world-and-why-i-still-think.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4376947664270917518/posts/default/3641717446643691192'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4376947664270917518/posts/default/3641717446643691192'/><link rel='alternate' type='text/html' href='http://plsadventures.blogspot.com/2010/02/state-of-world-and-why-i-still-think.html' title='The state of the world (and why I still think it&apos;s a bad idea)'/><author><name>Gian</name><uri>http://www.blogger.com/profile/04384217245734242968</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='26' src='http://1.bp.blogspot.com/_AhA71miQEj0/SoqxZcE7mlI/AAAAAAAAAAM/SCAy8ISor6w/S220/zoinks.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4376947664270917518.post-1031879750980877130</id><published>2009-12-07T04:27:00.002Z</published><updated>2009-12-07T04:34:32.774Z</updated><title type='text'>By Request:  My thoughts on Perl 6</title><content type='html'>Good idea:  Redesigning a clearly broken language from scratch&lt;br /&gt;&lt;br /&gt;Bad idea:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;multi unique (*$x, *@xs) { member($x, |@xs) ?? unique(|@xs) !! ($x, unique(|@xs)) }&lt;br /&gt;&lt;br /&gt;&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4376947664270917518-1031879750980877130?l=plsadventures.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://plsadventures.blogspot.com/feeds/1031879750980877130/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://plsadventures.blogspot.com/2009/12/by-request-my-thoughts-on-perl-6.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4376947664270917518/posts/default/1031879750980877130'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4376947664270917518/posts/default/1031879750980877130'/><link rel='alternate' type='text/html' href='http://plsadventures.blogspot.com/2009/12/by-request-my-thoughts-on-perl-6.html' title='By Request:  My thoughts on Perl 6'/><author><name>Gian</name><uri>http://www.blogger.com/profile/04384217245734242968</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='26' src='http://1.bp.blogspot.com/_AhA71miQEj0/SoqxZcE7mlI/AAAAAAAAAAM/SCAy8ISor6w/S220/zoinks.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4376947664270917518.post-3525843592997632734</id><published>2009-10-05T12:18:00.010+01:00</published><updated>2009-10-05T13:38:39.678+01:00</updated><title type='text'>The bug-free myth: Why we trust verified software</title><content type='html'>Any time there is a discussion of formal proofs of correctness in software, I hear the same arguments made.  These arguments are almost invariably made by people who have never done any formal verification, and who don't necessarily have a particularly good knowledge of how formal verification even works.  This post will attempt to provide both a few counter-arguments against the usual objections, and an overview of the theoretical basis upon which formal verification of software depends.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;The Bug-free myth&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;All too often, I hear the phrase "free of bugs" used in conjunction with theorem proving or formal verification of software.  I want to make perfectly clear, &lt;span style="font-weight: bold;"&gt;this is not what theorem proving claims&lt;/span&gt;.  You'll often hear phrases like "guaranteed correct".  This is an (informal) shorthand for "guaranteed correct &lt;span style="font-style: italic;"&gt;with respect to some specification&lt;/span&gt;".  Allow me to make the distinction clear with an example.&lt;br /&gt;&lt;br /&gt;Here is a program:&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;x = 0;&lt;br /&gt;while(1) {&lt;br /&gt;x = x + 1;&lt;br /&gt;}&lt;br /&gt;&lt;/pre&gt;This program isn't very useful.  It is not, however, &lt;span style="font-style: italic;"&gt;incorrect&lt;/span&gt; (nor is it "correct"), without reference to some specification.  If our specification is simply "keep incrementing a variable until the universe ends", this program is correct.  If we were expecting this program to calculate something useful, or produce some output, then it is obviously&lt;span style="font-style: italic;"&gt; &lt;/span&gt;incorrect&lt;span style="font-style: italic;"&gt;.&lt;span style="font-style: italic;"&gt;&lt;/span&gt;&lt;/span&gt;  The "correctness" judgement always follows from a specification, and no judgement can be made in the absence of a specification.&lt;br /&gt;&lt;br /&gt;This definition covers the (informal) usage of "correctness", but this is still distinct from "bug-free".  The main activity within formal verification of software is generally &lt;span style="font-style: italic;"&gt;&lt;/span&gt;the verification of specific &lt;span style="font-style: italic;"&gt;safety&lt;/span&gt; and &lt;span style="font-style: italic;"&gt;liveness&lt;/span&gt; properties.&lt;br /&gt;&lt;br /&gt;"Safety" is approximately the guarantee that some bad thing can never happen.  It might be that a certain storage location can never overflow, or that a value is always positive, or that a pointer is not null, or even much more concrete functional things, like "the autopilot will not switch off the engines while the plane is in flight".  Keep in mind that doing &lt;span style="font-style: italic;"&gt;nothing&lt;/span&gt; is always "safe".  It's impossible for a storage location to overflow if the program never does anything.  A program that simply halts immediately will fulfil almost any safety property you care to dream up, so we can re-state the definition as "if anything happens, it will be something desirable".&lt;br /&gt;&lt;br /&gt;This is where &lt;span style="font-style: italic;"&gt;liveness&lt;/span&gt; comes in.  This is approximately the guarantee that "some desirable thing will happen (eventually)".  These might be properties like "the doors will open", or "the value stored in X will be incremented".&lt;br /&gt;&lt;br /&gt;When people talk about software  being &lt;span style="font-style: italic;"&gt;verified&lt;/span&gt;, they really mean that a set of safety and liveness properties have been shown to be true of the software in question.  When undertaking a verification activity, we choose a set of safety and liveness properties that are of interest or importance, and we prove those.  A property like "there are no bugs" is completely useless (and unprovable), however, for some safety property X, we can at least say "there are no bugs that would cause property X to become false".&lt;br /&gt;&lt;br /&gt;One of the main ways we verify these safety and liveness properties is through the use of automated or interactive &lt;span style="font-style: italic;"&gt;&lt;span style="font-style: italic;"&gt;&lt;/span&gt;theorem provers.&lt;/span&gt;  These are pieces of software that permit the construction of mathematical proofs, by both &lt;span style="font-style: italic;"&gt;checking&lt;/span&gt; the proofs, and by providing some assistance to a human in searching for a suitable proof that some property holds.  Unfortunately, as soon as you mention theorem provers and all the wonderful things they do, this usually leads to a new objection.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;But what if the theorem prover has bugs in it&lt;/span&gt;?&lt;br /&gt;&lt;br /&gt;OK, it seems a natural enough objection, but you don't think we're completely dumb, do you?  For people in the business of proving software correct, you have to assume that we have some reason to believe the software we use is correct.&lt;br /&gt;&lt;br /&gt;But actually, &lt;span style="font-style: italic;"&gt;correctness&lt;/span&gt; within theorem provers is less important than you might think.  I said earlier that a theorem prover has two basic functions - it acts as a &lt;span style="font-style: italic;"&gt;proof checker&lt;/span&gt;, and it provides tools to aid in coming up with a suitable proof of a property.&lt;br /&gt;&lt;br /&gt;The important part of this is that only the proof checker needs to be correct.  The other parts of the theorem prover could just generate random garbage proofs, and they would be rejected as incorrect proofs of the property under consideration by the proof checker.  In practice, the techniques used in proof search have some very good heuristics about the chance that they will succeed in finding a valid proof, and the time it will take for them to find such a proof, but these are just for the benefit of their users.  These proof-search aids do not simply return "true" if they find a valid proof - they instead must return the proof itself, which the proof-checker can then verify.&lt;br /&gt;&lt;br /&gt;So, hopefully you have been suitably convinced that &lt;span style="font-style: italic;"&gt;most&lt;/span&gt; of the theorem prover does not need to be correct.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;So how do you know that the proof checker is correct?&lt;span style="font-style: italic;"&gt;&lt;br /&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;I'm glad you asked.  This is going to require a quick whirlwind tour of how mathematical logic works.&lt;br /&gt;&lt;br /&gt;Theorem provers like &lt;a href="http://www.cl.cam.ac.uk/research/hvg/Isabelle/"&gt;Isabelle&lt;/a&gt; and &lt;a href="http://coq.inria.fr/"&gt;Coq&lt;/a&gt; have a wonderfully rich set of constructs built into them - numbers, arithmetic, records, functions - and the list goes on.  The important fact about all of these things is that they can be constructed from a very small set of much simpler constructs.&lt;br /&gt;&lt;br /&gt;There are a class of logics called &lt;a href="http://en.wikipedia.org/wiki/Hilbert_system"&gt;Hilbert Systems&lt;/a&gt; that are built using a very small number of axioms and inference rules, and which permit you to build up most of the familiar constructs from logic and mathematics in a systematic way.  Similarly, &lt;a href="http://en.wikipedia.org/wiki/Zermelo%E2%80%93Fraenkel_set_theory"&gt;Zermelo–Fraenkel set theory&lt;/a&gt; also allows one to build up almost all of mathematics in a systematic way.  This is old magic, and it is pretty well understood.&lt;br /&gt;&lt;br /&gt;Because I'm writing this for a general audience, I don't feel the need to scrape the depths of my own knowledge to come up with the specifics of the logical foundations used within specific theorem provers, but if you get the general idea that there have been  many attempts to formalise all of mathematics from a very small set of core constructs, and many of them have been successful enough to give us a formalised foundation for the types of mathematics most of us care about, then you already know how a theorem prover works.&lt;br /&gt;&lt;br /&gt;So once we've encoded this very small core manually, we can build all the rest of our theorem prover theories from this small core.  We can mechanically check that we are doing this right, either by hand, or using various &lt;a href="http://en.wikipedia.org/wiki/Automated_proof_checking"&gt;automated proof-checking&lt;/a&gt; techniques.&lt;br /&gt;&lt;br /&gt;The small core inside the theorem prover is all that is really important now, in terms of correctness.  This can be verified by hand, and it can also be verified using other theorem provers.&lt;br /&gt;&lt;br /&gt;But even this doesn't &lt;span style="font-style: italic;"&gt;really&lt;/span&gt; need to be correct.  When we use a theorem prover to aid us in proving a property, we end up with a &lt;span style="font-style: italic;"&gt;proof script&lt;/span&gt; - some machine-readable version of the proof.  This doesn't necessarily rely upon the software that was used to help create it (although it might get messy trying to decouple it).  This proof can be verified using any other piece of software, or it can be verified by hand without any creativity - you just have to mechanically apply each step in the proof and satisfy yourself that each step follows by a valid step of inference from the last.&lt;br /&gt;&lt;br /&gt;The final important property of theorem proving software is that we can always look for &lt;span style="font-style: italic;"&gt;consistency&lt;/span&gt;.  If we get anything wrong anywhere along the way, proofs are going to start failing.  It is already true that there are some &lt;a href="http://en.wikipedia.org/wiki/G%C3%B6del%27s_incompleteness_theorems"&gt;valid statements that cannot be proved to be true&lt;/a&gt; in the logic in question, but this isn't really a problem as far as the correctness of the theorem prover is concerned.  We would only be concerned if there were false statements that could somehow be shown to be true inside our logical system - and this is not the case!  It is erroneous to prove a contradiction, so if this happens, mechanical proof checking will reject the proof.&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;&lt;span style="font-style: italic;"&gt;&lt;span style="font-weight: bold;"&gt;&lt;span style="font-style: italic;"&gt;&lt;span style="font-weight: bold;"&gt;&lt;span style="font-style: italic;"&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;/span&gt;But what if the proof has bugs in it&lt;/span&gt;?&lt;br /&gt;&lt;br /&gt;Hopefully you are satisfied that we can have a very high degree of confidence in the results we get out of a theorem prover, so a proof is not like a "program" in the sense that we can have valid steps of reasoning and just "get the wrong answer".  The only place where real "bugs" can be introduced is in the human side of things, where we choose the wrong safety or liveness properties that do not actually capture the properties we are interested in.&lt;br /&gt;&lt;br /&gt;There are several strategies that we can use here.  The first is syntax and type checking on the specification itself.  Given that we are expressing our safety properties inside a theorem prover using something that is essentially a programming language, we can use the same techniques that have been shown to be very useful in enforcing consistency and detecting errors in traditional programming languages.  This should remove the possibility that a simple typo could be totally catastrophic.&lt;br /&gt;&lt;br /&gt;The second way is enforcing a discipline where we re-state the same properties in different ways.  By doing this, we ensure that either a proof of one of the properties will fail, or we will introduce inconsistency between our encodings of the properties which will be rejected by the theorem prover.&lt;br /&gt;&lt;br /&gt;Finally, and although I've previously argued against it, &lt;span style="font-weight: bold;"&gt;be careful&lt;span style="font-style: italic;"&gt;.  &lt;/span&gt;&lt;/span&gt;This is very different to the suggestion that you just "be careful" when constructing a program in a traditional language.  This is about a set of safety properties.  You can only guarantee that your program is correct with respect to the specification that you have written down.  If you are falsely claiming confidence in your software where none exists, then that would be just as catastrophic in any environment.  I really think it's worth saying here that we are way off into the land of the abstract.  Even in a complex specification, it would be very difficult to encode things in such a way that you succeed in proving a whole set of properties, none of which provide the guarantees you think you are making (the proof is likely to fail if you have an inconsistent model of what you think you are proving and what you are actually proving).  This is a much stronger safeguard against human error than is provided in any traditional programming environment available today.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;But what if the compiler is incorrect?&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;There are two basic counter-arguments to this.  Basically, in proving properties about software, you are either working with an idealised model of the software encoded into the theorem prover assuming some particular semantics for the target language and/or underlying hardware, or you are actually writing the software &lt;span style="font-style: italic;"&gt;inside&lt;/span&gt; the theorem prover, and then generating code once you have proven all the properties of interest.  In either case, you are somewhat at the mercy of compilers.  This is always the case though - any correct program could be made incorrect if there are faults in the compiler or the hardware.  You can take two approaches here - assume that the hardware and compiler work as expected (therefore invalidating your guarantees of correctness if they do not), or make use of the &lt;a href="http://portal.acm.org/citation.cfm?id=1250734.1250742"&gt;recent work&lt;/a&gt; on &lt;a href="http://scholar.google.co.uk/scholar?hl=en&amp;amp;q=certified+compilation&amp;amp;btnG=Search"&gt;certified compilation&lt;/a&gt; to at least remove the uncertainty of the compiler from the equation.  There is a whole different body of work on (hardware) &lt;a href="http://scholar.google.co.uk/scholar?hl=en&amp;amp;q=formal+verification+hardware+fault+tolerance&amp;amp;btnG=Search"&gt;fault tolerance&lt;/a&gt; in software.  If you're really getting to this level of safety, then you probably need to be an expert on these kinds of issues anyway.  Theorem provers for software verification are concerned with the verification of certain properties.  If validating your assumptions about the hardware and software environments are important, then you will need to do more work.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Is it worth it?&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;For some things, yes.  This kind of formal software verification is hard work.  It's very labour intensive, and it requires a great deal of effort.  That said, it is getting easier as the tools get better, the barrier to entry is getting lower.  I don't think it will be very long before we start seeing a lot more in the way of light-weight theorem proving taking place as a standard part of compilation.  In a world where software failure costs huge amount of money, it doesn't seem so remote a possibility that it will become practical and cost-efficient to apply formal verification to more software.&lt;br /&gt;&lt;br /&gt;--&lt;br /&gt;&lt;br /&gt;Personal plea:  I'm still looking for reasonably-priced accommodation for my partner and I in Copenhagen if anyone has any leads.  I'm eager to go start my PhD at &lt;a href="http://www.itu.dk"&gt;ITU&lt;/a&gt;, but need to find somewhere to live first.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4376947664270917518-3525843592997632734?l=plsadventures.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://plsadventures.blogspot.com/feeds/3525843592997632734/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://plsadventures.blogspot.com/2009/10/any-time-there-is-discussion-of-formal.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4376947664270917518/posts/default/3525843592997632734'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4376947664270917518/posts/default/3525843592997632734'/><link rel='alternate' type='text/html' href='http://plsadventures.blogspot.com/2009/10/any-time-there-is-discussion-of-formal.html' title='The bug-free myth: Why we trust verified software'/><author><name>Gian</name><uri>http://www.blogger.com/profile/04384217245734242968</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='26' src='http://1.bp.blogspot.com/_AhA71miQEj0/SoqxZcE7mlI/AAAAAAAAAAM/SCAy8ISor6w/S220/zoinks.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4376947664270917518.post-5635245235285523227</id><published>2009-10-02T13:34:00.007+01:00</published><updated>2009-10-02T14:56:15.638+01:00</updated><title type='text'>Insecure software is your fault</title><content type='html'>Security breaches in software are really just a special case of software failure, but it has the dubious honour of having a "community" build around it.  There is a cost associated with all software failure, but most people are willing to tolerate a certain amount of failure in their applications - as I'm often reminded whenever I discuss formal methods with other programmers - "mostly working is good enough".  I can understand this viewpoint (although I don't agree with it), however by this reasoning, what is the acceptable level of security vulnerability in software?&lt;br /&gt;&lt;br /&gt;You can tolerate your word processor crashing every few days, but is it acceptable for your entire customer database to be stolen once every few days?  How about your production server being rooted once a month?  A crippling worm attack once per year?  Where does the cost/benefit relationship come into play here?&lt;br /&gt;&lt;br /&gt;I don't believe that anyone would seriously advocate the position that a serious security breach once per year resulting in data loss or theft is "acceptable".  I doubt anyone wants to have to pay for system administrators to reinstall compromised machines once a month.&lt;br /&gt;&lt;br /&gt;The title of this article is "insecure software is &lt;span style="font-weight: bold;"&gt;your&lt;/span&gt; fault".  I use "your" to refer to &lt;span style="font-style: italic;"&gt;developers&lt;/span&gt;.  I'll expand on this towards the end of this article, but basically, I believe this to be the case because we have the tools to eliminate most kinds of software vulnerabilities, and yet they &lt;span style="font-style: italic;"&gt;still exist&lt;/span&gt;, because programmers continue to not employ any of these tools or techniques.  If you aren't a software developer, or you actually use the best tools and practices available to you to produce good software, then you can safely assume that I don't mean that insecure software is your fault&lt;span style="font-style: italic;"&gt;&lt;span style="font-style: italic;"&gt;&lt;span style="font-style: italic;"&gt;&lt;span style="font-style: italic;"&gt;.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;I should create some distinction here between two different types of software "insecurity":&lt;br /&gt;&lt;br /&gt;1) Flaws and defects caused by language-level features such as manual memory management or insufficient sanity checks on input that permit damage to data or unauthorised access at the same privilege level as the program itself.  SQL injection and some buffer overflow attacks would seem to be obvious examples of this kind of thing.&lt;br /&gt;&lt;br /&gt;2) Flaws or defects caused by deficiencies in design - i.e., the software is unable to be compromised itself, however it enables undesirable actions within the application itself.  Cross-site scripting attacks are the most obvious example of this - the browser itself remains uncompromised, but any data inside the browser is  basically at risk due to an inadequately designed security model.&lt;br /&gt;&lt;br /&gt;In the first class of vulnerabilities, there are language-based solutions.  Type safety guarantees are sufficiently strong to prevent most kinds of attacks that rely on runtime type violations to exploit flaws.  Removing manual memory management from languages reduces another huge class of possible errors.  Automatic bounds-checking is the third and final part of it.  Contrary to popular myth, the performance cost of these things is very small, and yet the benefit is huge.  You've just eliminated even the &lt;span style="font-style: italic;"&gt;potential&lt;/span&gt; to inadvertently expose yourself and your users to a huge class of security vulnerabilities that will cost time and money to fix.&lt;br /&gt;&lt;br /&gt;Claiming that you don't need any of these catastrophe-preventing measures because you're &lt;span style="font-style: italic;"&gt;careful&lt;/span&gt; is a weak excuse.  It's a weak excuse because every experience we've had shows us that being careful just isn't good enough.  Suggesting that any developer who has ever introduced a security flaw just wasn't careful enough is a hugely arrogant assertion.  If you accept that security vulnerabilities &lt;span style="font-style: italic;"&gt;can&lt;/span&gt; happen to people who are both smart and careful, then why can't they happen to you?  You have a responsibility to your users to be more than &lt;span style="font-style: italic;"&gt;careful&lt;/span&gt;, you have a responsibility to be &lt;span style="font-style: italic;"&gt;sure&lt;/span&gt;.  And using language-based features that remove a huge class of potential errors is a way to be sure.&lt;br /&gt;&lt;br /&gt;In some rare circumstances where performance concerns actually dictate that you must not use these features, &lt;span style="font-style: italic;"&gt;don't write a network-facing service&lt;/span&gt;.  The only place I suggest being &lt;span style="font-style: italic;"&gt;careful&lt;/span&gt; is in isolating potentially insecure software from the outside world.&lt;br /&gt;&lt;br /&gt;There is a small caveat to this - not all languages were created equal.  In some cases the design of the language may be sufficient to remove a large class of vulnerabilities (through automatic memory management, bounds checking and sensible typing strategies), however in the interpreted case, you are essentially at the mercy of the interpreter.  Your code may be perfectly sane, and the language may be sane, but the interpreter written in C isn't necessarily sane or safe.  I add this only as a caution, because the same could be said of the operating system or even the hardware platform - you're only as secure as the things you rely upon.  However, anecdotally at least, it's very rare that operating systems have vulnerabilities that are likely to be exploitable only within your application if your application is otherwise defect-free, whereas this isn't always the case with network-facing interpreters.&lt;br /&gt;&lt;br /&gt;The second class of errors I identified is errors caused by flaws in design.  These can involve things like leaving scope for protocol violations, insufficient security models and some downright stupidity.  Very few of these situations have an obvious "language-based" solution at this moment.  If your application-level security model is flawed, the language-based features aren't going to help.  If illegitimate users of the application are able to cause mischief without ever "exploiting", then the security model is flawed.  This could be as simple as leaving a "delete" action outside of some authenticated realm (allowing unauthenticated users to delete things they shouldn't be able to), or it could be that a specially crafted series of actions leads to an undesirable or inconsistent state.&lt;br /&gt;&lt;br /&gt;Whatever the specifics, these are all essentially &lt;span style="font-style: italic;"&gt;safety properties&lt;/span&gt;, as in "this software is not able to exhibit these undesirable behaviours".  We actually have ways of checking safety properties nearly automatically these days.  You can check various properties through &lt;a href="http://en.wikipedia.org/wiki/Model_checking"&gt;model checking&lt;/a&gt;.  You can build things inside theorem provers and prove safety properties.  Tools like &lt;a href="http://groups.csail.mit.edu/pag/daikon/"&gt;Daikon&lt;/a&gt; can even aid in the generation of various invariants through runtime analysis.  &lt;a href="http://secure.ucd.ie/products/opensource/ESCJava2/"&gt;ESC/Java2&lt;/a&gt; can check certain kinds of safety properties at compile time.  &lt;a href="http://mtc.epfl.ch/software-tools/blast/"&gt;BLAST&lt;/a&gt; works on C programs.  &lt;a href="http://research.microsoft.com/en-us/projects/specsharp/"&gt;Spec#&lt;/a&gt; can express various safety properties, and &lt;a href="http://scholar.google.co.uk/scholar?hl=en&amp;amp;q=static+checking+of+safety+properties&amp;amp;btnG=Search"&gt;there are hundreds more similar efforts&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Basically what I'm getting at is that these tools really aren't impossible to use, and they can give you concrete guarantees where you might otherwise have just been guessing.  The main thing is the ability to ensure a globally consistent view of the implications of any design decision relating to security.  I've previously asserted that humans aren't very good at keeping globally consistent mental models of complex systems, hence why I (and many others before me) advocate mechanical approaches to ensuring global consistency.&lt;br /&gt;&lt;br /&gt;So the take-home point from this (rather long) article is that we actually have the tools &lt;span style="font-style: italic;"&gt;right now&lt;/span&gt; to eliminate most of the kinds of security vulnerabilities that exist.  The problem is that people aren't really using them in many places where they really should be.  The much-reported damage caused by viruses and worms and "crackers" and DDoS attacks and zombies are all the direct result of insecure software.  Sure, people who engage in those sorts of activities are criminals, but really, we're basically leaving our doors and windows wide open and yet we still seems surprised when somebody comes in and steals stuff.  If anything, I think we should all start to mentally substitute the phrase "damage caused by cyber-criminals" with "damage caused by irresponsible software developers".&lt;br /&gt;&lt;br /&gt;If you happen to be a "computer security professional" (scare quotes intentional), then perhaps you can educate me as to why anti-virus software and firewalls and chasing endless exploits through ad-hoc "penetration testing" is still regarded as the primary activity of the "security community", rather than actually making serious efforts towards preventing security defects in the first place.  I have my own pet theory (spoiler alert: $$$), but I'd be interested to hear more informed opinions.&lt;br /&gt;&lt;br /&gt;--&lt;br /&gt;&lt;br /&gt;As a side note, I can now announce that I've accepted a PhD position in the &lt;a href="http://www.itu.dk/research/pls/wiki/index.php/Main_Page"&gt;PLS group&lt;/a&gt; at the &lt;a href="http://www.itu.dk/"&gt;IT University of Copenhagen&lt;/a&gt;.  I'm currently looking for a place for my parter and I to live in Copenhagen from the start of November, so if anyone from Denmark reading this has any leads, please let me know!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4376947664270917518-5635245235285523227?l=plsadventures.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://plsadventures.blogspot.com/feeds/5635245235285523227/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://plsadventures.blogspot.com/2009/10/insecure-software-is-your-fault.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4376947664270917518/posts/default/5635245235285523227'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4376947664270917518/posts/default/5635245235285523227'/><link rel='alternate' type='text/html' href='http://plsadventures.blogspot.com/2009/10/insecure-software-is-your-fault.html' title='Insecure software is your fault'/><author><name>Gian</name><uri>http://www.blogger.com/profile/04384217245734242968</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='26' src='http://1.bp.blogspot.com/_AhA71miQEj0/SoqxZcE7mlI/AAAAAAAAAAM/SCAy8ISor6w/S220/zoinks.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4376947664270917518.post-3899501203216615861</id><published>2009-09-30T10:47:00.002+01:00</published><updated>2009-09-30T10:53:29.116+01:00</updated><title type='text'>Bletchley Park</title><content type='html'>Rare non-PL post, but I figure this one is worth it.&lt;br /&gt;&lt;br /&gt;I went to &lt;a href="http://www.bletchleypark.org.uk/"&gt;Bletchley Park&lt;/a&gt; last weekend, and had an absolutely fantastic time. There was a lot more there than I expected. It's an absolute bargain at £10, and it's only about 45 minutes on the train from Euston station in London. It's definitely worth doing one of the free tours, too. The guides are extremely knowledgeable and entertaining.&lt;br /&gt;&lt;br /&gt;Also located there is the &lt;a href="http://www.tnmoc.org/"&gt;National Museum of Computing&lt;/a&gt;, which was fascinating.  I took some pictures:&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://photos-f.ak.fbcdn.net/hphotos-ak-snc1/hs259.snc1/10626_140117849177_550119177_2401013_5799611_n.jpg"&gt;&lt;img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer; width: 604px; height: 453px;" src="http://photos-f.ak.fbcdn.net/hphotos-ak-snc1/hs259.snc1/10626_140117849177_550119177_2401013_5799611_n.jpg" alt="" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;And one of me standing in front of a PDP-11 looking quizzical:&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;div style="text-align: center;"&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://photos-e.ak.fbcdn.net/hphotos-ak-snc1/hs279.snc1/10626_140117894177_550119177_2401020_172803_n.jpg"&gt;&lt;img style="margin: 0pt 10px 10px 0pt; float: left; cursor: pointer; width: 604px; height: 453px;" src="http://photos-e.ak.fbcdn.net/hphotos-ak-snc1/hs279.snc1/10626_140117894177_550119177_2401020_172803_n.jpg" alt="" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;So, if you are in the UK, or are planning a visit, definitely make the effort to get to Bletchley Park.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4376947664270917518-3899501203216615861?l=plsadventures.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://plsadventures.blogspot.com/feeds/3899501203216615861/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://plsadventures.blogspot.com/2009/09/bletchley-park.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4376947664270917518/posts/default/3899501203216615861'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4376947664270917518/posts/default/3899501203216615861'/><link rel='alternate' type='text/html' href='http://plsadventures.blogspot.com/2009/09/bletchley-park.html' title='Bletchley Park'/><author><name>Gian</name><uri>http://www.blogger.com/profile/04384217245734242968</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='26' src='http://1.bp.blogspot.com/_AhA71miQEj0/SoqxZcE7mlI/AAAAAAAAAAM/SCAy8ISor6w/S220/zoinks.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4376947664270917518.post-2534172664721300565</id><published>2009-09-24T14:24:00.008+01:00</published><updated>2009-09-24T16:34:26.381+01:00</updated><title type='text'>Two features that would improve my life</title><content type='html'>If you've ever seen &lt;a href="http://en.wikipedia.org/wiki/From_the_Earth_to_the_Moon_%28TV_miniseries%29"&gt;From the Earth to the Moon&lt;/a&gt;, you might remember the episode about the creation of the Lunar Lander.  Tom Kelly is shown standing outside throwing a red ball against a wall repeatedly while he tries to think of solutions to hard problems.  This scene has always been memorable to me because I think it is the only cinematic depiction of "problem solving" that has ever seemed familiar to me.&lt;br /&gt;&lt;br /&gt;I cite this scene because it echoes the way I problem solve.  I think a lot, and I do strange things in an undirected manner until I solve it.  When I'm actually creating software, I generally know how most of it is going to work before I even start typing.  Writing all the supporting code and boilerplate is not intellectually challenging - and I think this is the majority of the "programming" that a commercial programmer is called upon to do.&lt;br /&gt;&lt;br /&gt;In the midst of this programming-as-typing though, there are often hard problems that need to be solved.  These are usually the bits that I find myself actually needing to think about - they are usually hard problems to reason about, and I find that I "code around" them in many instances.  It's the kind of situation where you might write a stub and then promise yourself to come back and write it later, once you understand the problem (and its solution) a little better.&lt;br /&gt;&lt;br /&gt;A recent example for me - I was attempting to figure out how to do a novel kind of type inference over complex terms.  I wrote a "infer" function that takes a piece of a program and returns a typed piece of program.  And then I wrote some tests, and my implementation consists of just returning "unknown type".  Not very useful.&lt;br /&gt;&lt;br /&gt;I have a bunch of strategies I usually use when I'm trying to solve these kinds of problems.  I scribble on paper a lot, trying to contrive concrete examples that I can start to generalise into an actual algorithm.  That's fine at the early stages, and helps my understanding a lot.&lt;br /&gt;&lt;br /&gt;Another strategy is to use some sort of REPL (interactive environment) and try to write bits of code that get me closer to my goal.  I usually use MLton for development, so this usually means switching over to SML/NJ so that I can start to "sketch" in code.&lt;br /&gt;&lt;br /&gt;Often these two approaches are enough to get me most of the way to a solution.  If the problem is &lt;span style="font-style: italic;"&gt;really&lt;/span&gt; hard, it might not be.  In these cases, I generally wheel out my trusty theorem prover, &lt;a href="http://www.cl.cam.ac.uk/research/hvg/Isabelle/"&gt;Isabelle&lt;/a&gt;.  At this point, it becomes possible to re-implement the bits of supporting code that I need in order to express the problem, and then start trying to test my assumptions by actually proving properties about the kinds of functions I'm "sketching".  This forces me to make very concrete my assumptions and intentions, because a theorem prover refuses to make educated guesses.&lt;br /&gt;&lt;br /&gt;So, having gone through that entire process, I was thinking of something that would make my life better:  a language-aware prototyping environment.&lt;br /&gt;&lt;br /&gt;Let me explain.   I have very little tolerance (or desire to use) visual programming environments, or "CASE tools", or very dynamic languages that permit "rapid prototyping".  I'm talking about something that will help while I'm trying to solve hard problems embedded within lots of other code during "real" software development.&lt;br /&gt;&lt;br /&gt;Basically I want the ability to access all of the supporting code and boilerplate that I've written in order to express my problem from within a prototyping environment.  Having to re-implement things inside Isabelle is a pain, particularly if I'm dealing with problems that are linked to the way things are represented.  I have to re-build all of the structures in another language just to be able to re-state the problem I'm trying to solve.&lt;br /&gt;&lt;br /&gt;So I want to be able to instantiate a prototyping environment exactly where I would currently just write a stub.  I want to be able to see visualisations of data easily.  I'd like to be able to do some of the things that I do with the REPL.  I'd like to be able to easily save and re-run test cases.  So far, so pedestrian.&lt;br /&gt;&lt;br /&gt;And this is where it starts to get more interesting.  I want to be able to write partial specifications.  In this kind of "exploratory programming" - I want to be able to start to apply "fuzzy" constraints until they become clearer in my head, mixing them with valid program code.  I think a good way to do this would be to employ some of the techniques from Qualitative Spatial and Temporal Reasoning.  These could be quite simple things, for example:  "this value gets bigger over time", "this event is going to occur before this one", and when dealing with complex data structures - "this value exists inside this other one".&lt;br /&gt;&lt;br /&gt;None of this would be directly executable, naturally, but with a range of visualisation options and some consistency checking, it would give you an ability to interact with a partially-written program in ways that you would otherwise start to do manually in a REPL.  Taking some of the QSTR statements and deriving some new facts from them would be useful too - if I have a bit of code saying that x gets bigger over time, then perhaps it could tell me that this means that y will get smaller over time.  It could draw me a pretty graph!  By being able to mix "real" code with these kinds of qualitative statements, I think you could start to get some interesting results.  I think the trick would be to integrate it into a compiler so that it's not necessary to do this in a "dynamic" way - the stuff that is not directly executable just gets compiled into some code that implements the prototyping environment.&lt;br /&gt;&lt;br /&gt;That's the first feature that I think would improve my life.  It would serve as a (metaphorical) "red ball" to throw against the wall while you try to solve a hard problem.&lt;br /&gt;&lt;br /&gt;The second is a fairly simple extension:  generate tests, specifications and documentation based on this prototyping process.  By including your compiler in the prototyping process, it seems like you are actually starting to capture a whole bunch of useful information.&lt;br /&gt;&lt;br /&gt;If I specified that A comes before B, then surely there are some useful tests (what happens if B comes before A, for example?).  If I made some statement about the depth of a tree versus the number of iterations through a piece of code, then that might be an interesting test or profiling metric.&lt;br /&gt;&lt;br /&gt;From a documentation standpoint, it seems like you could generate some useful documentation that provides insight into how a "difficult" piece of code actually works, because hopefully it would be possible to capture some of the rationale and the intuitive problem solving process that led to the eventual solution.  By keeping track of the revisions made inside the prototyping environment, you essentially have a record of the "story" of the code as it becomes more concrete.&lt;br /&gt;&lt;br /&gt;While it would probably be a reasonable amount of work, it's certainly not impossible.  I believe the minimal set of functionality that would actually make this a useful feature would be quite small.  I think this could be a "killer app" for some programming language - as long as the rest of the language was sane, I think I could be convinced to make a switch.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4376947664270917518-2534172664721300565?l=plsadventures.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://plsadventures.blogspot.com/feeds/2534172664721300565/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://plsadventures.blogspot.com/2009/09/two-features-that-would-improve-my-life.html#comment-form' title='6 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4376947664270917518/posts/default/2534172664721300565'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4376947664270917518/posts/default/2534172664721300565'/><link rel='alternate' type='text/html' href='http://plsadventures.blogspot.com/2009/09/two-features-that-would-improve-my-life.html' title='Two features that would improve my life'/><author><name>Gian</name><uri>http://www.blogger.com/profile/04384217245734242968</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='26' src='http://1.bp.blogspot.com/_AhA71miQEj0/SoqxZcE7mlI/AAAAAAAAAAM/SCAy8ISor6w/S220/zoinks.jpg'/></author><thr:total>6</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4376947664270917518.post-3961425982353673974</id><published>2009-09-14T12:03:00.007+01:00</published><updated>2009-09-14T12:58:48.298+01:00</updated><title type='text'>Pigeons and Programming: Superstitious behaviour</title><content type='html'>You may be familiar with B.F. Skinner's experiments with pigeons, published in 1948.  Basically, pigeons were kept in cages and fed on random schedules.  This led to the pigeons developing some rather bizarre behaviour - turning in clockwise circles, attempting to "lift" an invisible bar, and various other odd actions.  Skinner's conclusion was that this kind of reinforcement led to a kind of &lt;span style="font-style: italic;"&gt;weak association&lt;/span&gt; between whatever the pigeon was doing when food was delivered and the "reward" of getting food.  The eventual conclusion was that these pigeons were displaying a kind of "superstitious behaviour", which led them to repeat actions that resulted in reward in the past, despite there being absolutely no causal effect.&lt;br /&gt;&lt;br /&gt;Some psychologists object to extending these results to an explanation for superstitious behaviour in humans, but I believe the link between reinforcement and repeated behaviour is relatively well-demonstrated.&lt;br /&gt;&lt;br /&gt;What's the point of this pigeon story?  I think it applies to programming.&lt;br /&gt;&lt;br /&gt;Having a poor mental model of the semantics of a programming language leads to all sorts of bizarre superstitious behaviour.  You only have to read &lt;a href="http://www.thedailywtf.com/"&gt;&lt;span class="blsp-spelling-error" id="SPELLING_ERROR_0"&gt;thedailywtf&lt;/span&gt; &lt;/a&gt;for a little while to see all kinds of examples of the programming equivalent of throwing salt over your shoulder and turning around three times.  Let me stretch the analogy a little further:&lt;br /&gt;&lt;br /&gt;Certain programming structures lead to success in a task.  This success leads to a kind of reinforcement of the techniques used to reach that solution.  This "reinforcement" could be compiling, passing tests or running successfully.  Similarly, there is a kind of negative reinforcement schedule - compile errors, failing tests or &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_1"&gt;runtime&lt;/span&gt; failures.&lt;br /&gt;&lt;br /&gt;It's important to form a distinction at this point between simple learning-from-experience and the kinds of superstitious behaviour that I'm arguing occur in programming.  Let me give an example - having a poor knowledge of a given language, let's say that I construct a program and I apply techniques A, B, C and D.  These could be anything, really - naming schemes for variables, grouping of sub-expressions with parentheses, adherence to particular patterns, or anything else.  It may be that only A and C were sufficient in order to have the program compile and run.  Unfortunately, I've also included B and D.  Next time I come to write a program, if I continue applying B and D, these become a kind of entrenched superstition.  They serve no purpose, except to make reading code more difficult for programmers who do not share your superstitions.  This is still just learning from experience though - this isn't inherently "superstitious" - it's just a school of programming associated with throwing everything at a wall and seeing what sticks.&lt;br /&gt;&lt;br /&gt;Now, here's the crux:  it is &lt;span style="font-style: italic;"&gt;inconsistent application&lt;/span&gt; of "reward" that leads to superstitious behaviour.  This inconsistency exists in two basic forms in programming languages.  The first is when there is a lot of hidden behaviour.  When there are a lot of hidden assumptions, it becomes very difficult to discern &lt;span style="font-style: italic;"&gt;why&lt;/span&gt; your program works.  This is where the compiler or framework takes on a "magical" quality, whereby the rules governing success or failure are so complex as to seem almost random.&lt;br /&gt;&lt;br /&gt;The second opportunity for inconsistent application of reward is late failure.  I find this kind of behaviour is regularly present within the high-level, dynamically-typed, interpreted languages.  Where the goals of these languages emphasise being accessible and intuitive for beginners, this actually equates in many cases to "keep running until the error is completely fatal".  Where failure occurs well away from the actual source of the error, diagnosing and correcting errors can quickly become an exercise in near-random mutation of source code.  When cause-and-effect become very difficult to predict, then changing one part of the system and having another piece break (or start working) becomes a kind of reinforcement schedule.  This has two dangerous impacts: 1) it leads to unnecessary error handling in places there shouldn't be error handling.  The hideous global "try/catch" block wrapping an entire program is a symptom of failure occurring much later than it should.  2) programmers become timid about touching certain parts of a system.  This leads to more "magical thinking", whereby a piece of the system not only is difficult to understand, but &lt;span style="font-style: italic;"&gt;should not be attempted to be understood&lt;/span&gt;.  This makes it almost impossible to refactor or grow a system in a structured way.  There will always be bits of "magical" code waiting to break unexpectedly.&lt;br /&gt;&lt;br /&gt;So of the two sources of inconsistency, the first is the easiest to address.  This basically just corresponds to "learn enough about your environment so that it no longer appears magical":&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Test your assumptions regularly - Do you &lt;span style="font-style: italic;"&gt;really&lt;/span&gt; need those extra parentheses?  If you're not sure, try it without.&lt;/li&gt;&lt;li&gt;Be a square: read the manual&lt;span style="font-style: italic;"&gt; - &lt;/span&gt;it seems incredibly dull, but it's actually worth understanding your language of choice from the ground up.  You may be a good enough programmer to pick up languages just by reading a bit of source code and a few ad &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_2"&gt;hoc&lt;/span&gt; tutorials, but there is something to be said for the systematic approach to learning a language - namely, the hidden behaviours of the environment become considerably less magical.&lt;/li&gt;&lt;li&gt;Avoid magic - If you're considering web frameworks or programming languages or whatever, beware the phrase "it all just works".  If you don't understand how or why it works, and you don't believe you are willing to devote the effort to understanding how or why it works, then the benefits of any whiz-bang features are going to be eaten up by the maintenance and correctness issues you'll suffer as a result of superstitious behaviour.&lt;/li&gt;&lt;/ul&gt;The late failure issue is considerably harder to address without simply suggesting switching to a language that can enforce certain properties and that will fail &lt;span style="font-style: italic;"&gt;early&lt;/span&gt;.  By "early" I mean, "at compile time" or "immediately after the first error case".  I am always particularly suspicious of languages where things like silent type coercion and silent initialisation of never-seen-before variables occurs.  These are basically just error cases waiting to happen, but they won't be revealed until much, much later in the execution of the program.  When a typo at line 1 can cause an incorrect result at line 10,000, you probably have a problem.&lt;br /&gt;&lt;br /&gt;So the only advice I can offer, aside from "fail early" is to employ self-discipline in the absence of language-enforced discipline.  Attempt to build structures that will cause your program to fail early rather than deferring failure.  Avoid developing superstitious behaviours around certain pieces of code - if everyone spends a little time questioning their own attitudes towards development (and particularly towards &lt;span style="font-style: italic;"&gt;debugging&lt;/span&gt;), then maybe there will be a little less superstitious coding behaviour in the world.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4376947664270917518-3961425982353673974?l=plsadventures.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://plsadventures.blogspot.com/feeds/3961425982353673974/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://plsadventures.blogspot.com/2009/09/pigeons-and-programming-superstitious.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4376947664270917518/posts/default/3961425982353673974'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4376947664270917518/posts/default/3961425982353673974'/><link rel='alternate' type='text/html' href='http://plsadventures.blogspot.com/2009/09/pigeons-and-programming-superstitious.html' title='Pigeons and Programming: Superstitious behaviour'/><author><name>Gian</name><uri>http://www.blogger.com/profile/04384217245734242968</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='26' src='http://1.bp.blogspot.com/_AhA71miQEj0/SoqxZcE7mlI/AAAAAAAAAAM/SCAy8ISor6w/S220/zoinks.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4376947664270917518.post-6469986191188279850</id><published>2009-09-07T15:14:00.007+01:00</published><updated>2009-09-07T16:25:47.329+01:00</updated><title type='text'>Why programming language design is hard (and a few ways it can be made easier)</title><content type='html'>When designing a programming language and accompanying implementation, you are forced to keep thinking at three different levels - the way that the implementation works, the way that people will use it, and the requirements that influence design decisions.&lt;br /&gt;&lt;br /&gt;On the face of it, this is no different to any other software development activity.  In building a stock control system or any other common piece of software, I would be forced to think about the way it will be used (e.g. how users will enter items and how the people in the distribution centre will scan items as they leave), the implementation (e.g. what classes/functions/methods implement this functionality) and the requirements influencing the design (e.g. Mr. Big Boss of International Corporation really wants the software to be beige).&lt;br /&gt;&lt;br /&gt;The difference with programming languages and many other types of software implementations is that it is difficult to relate your requirements and design decisions back into the real world.  When designing a stock control system, a stock item is designed in such a way that it is some representation of a real item in a real warehouse.  The behaviour of the system attempts to reflect real processes carried out by real people.  You can go down and meet the people who dispatch the items if you want, and ask them all about their jobs.  Having these two very concrete things at your disposal makes implementation correspondingly easier, because you have a "real world" system against which to validate your software system.&lt;br /&gt;&lt;br /&gt;Programming language design lacks these analogous constructs.  The "behaviour" of the programming language basically corresponds to the semantics of the language, which you're probably still trying to nail down (it's unlikely that you have a written semantics, right?).  The way the users are going to make use of your programming language depends on what the semantics are.  While you can contrive examples and try to implement them in the language, you lack perspective, because the semantics that you have implemented are exactly the way they are because they correspond to your understanding of the semantics.  You're probably the person worst-equipped to make judgements about the usability of your own creation.  This removes our two best tools for guiding an implementation; the ability to predict users' behaviour with respect to the system, and concrete things against which to verify our assumptions.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;A few solutions&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;1) Get feedback early and often&lt;/span&gt; - this is important in &lt;span style="font-style: italic;"&gt;all&lt;/span&gt; software development, but if you are designing a new programming language, you need to be talking to people all the time.  While some people you talk to will demonstrate objections on some sort of dogmatic ideological basis ("I don't like static type systems, and you have a static type system, so the entire language is junk"), explaining the semantics to other people &lt;span style="font-style: italic;"&gt;and&lt;/span&gt; gauging how much effort it requires for them to get a sense of the language will be very helpful in guiding future decisions.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;2) Know why you're designing a new programming language&lt;/span&gt; - novelty or fun are perfectly valid reasons for doing PL design, but never lose sight of the fact that almost nobody will use a new programming language unless it does something better/easier/faster than they can already do it in one of the 10 trillion programming languages in the wild today.  Think about the kinds of programs that you want the language to be able to express easily, and what kinds of programs will be more difficult to express.  You might be aiming to create a general-purpose programming language, but that doesn't mean it's going to be good at absolutely everything.  If you &lt;span style="font-style: italic;"&gt;do&lt;/span&gt; claim to be designing a general-purpose language, here's a good  thing to keep in mind at every point during your development: "How would I feel about writing this compiler in my new language?"&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;3) Write things down, often&lt;/span&gt; - Your ideas will probably change the more concrete your design gets.  As you start being able to interact with your design (because you have a partial implementation of an interpreter or compiler), you will probably change your mind.  Writing down a description (formal or informal) of your syntax and semantics will help keep you from completely contradicting your own stated goals.  If (as in point 2) you decided that you were designing a language that would be good at expressing programs of type X, and then you subsequently make some changes that make it very difficult to express those types of programs, then what was the point of your language again?  Keep referring to point 1.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;4) Write tests&lt;/span&gt; - however sceptical or enthusiastic you may be about test-driven development in other endeavours, I can't stress enough how useful it is for programming language design.  At the very least, keep a directory full of the test programs you write and automate some means of running these.  If you can write tests down to the statement level, do that.  You can never have too many tests.  As I said in point 3, your design is likely to change.  You need to be bold in refactoring and redesigning, otherwise your ability to change fundamental design decisions will suffer.  You'll end up building "hacks" into your syntax or semantics just because you really aren't sure if changing some assumption or construct will break everything.  This has the added benefit of providing you with a whole lot of sample code when you come to write documentation.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;5) Think about the theory&lt;/span&gt; - Don't be afraid of  theory.  Theory is your friend.  A lot of people have published a lot of papers on many facets of programming language design.  It's worth learning enough about the theory of programming languages to enable you to read some of this published output.  If you're stumbling around in the dark trying to figure out a way to statically enforce some property, or a more efficient register-allocation algorithm for code generation or just trying to figure out why your parser generator is complaining about your grammar, chances are that somebody has solved this problem before, and has expressed some solution to it in a published form.  You'll find it much easier to relate the constructs in your new language to other existing languages if you don't reinvent the terminology wheel too.  I can't recommend strongly enough &lt;a href="http://www.cis.upenn.edu/%7Ebcpierce/tapl/"&gt;Types and Programming Languages&lt;/a&gt; by Benjamin C. Pierce as an excellent starting point for all things theoretical in the programming languages world.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;6) Consider using &lt;a href="http://en.wikipedia.org/wiki/ML_%28programming_language%29"&gt;ML&lt;/a&gt;&lt;/span&gt; - I &lt;a href="http://www.cs.cmu.edu/%7Efp/courses/15411-f07/resources.html"&gt;wouldn't&lt;/a&gt; &lt;a href="http://www.cs.bu.edu/%7Ehwxi/academic/courses/CS525/Spring04/classpage.html"&gt;be&lt;/a&gt; &lt;a href="http://compilers.iecc.com/comparch/article/96-10-079"&gt;the&lt;/a&gt; &lt;a href="http://flint.cs.yale.edu/cs421/case-for-ml.html"&gt;first&lt;/a&gt; &lt;a href="http://compilers.iecc.com/comparch/article/05-04-075"&gt;to&lt;/a&gt; &lt;a href="https://launchpad.net/ubuntu/jaunty/+package/smlnj"&gt;note&lt;/a&gt; that ML is really good for writing compilers.  Obviously, your circumstances may be different to mine, but having written compilers and interpreters in a few different languages, I haven't found anything quite as good as ML at expressing the kinds of parsing, validation and translation activities that are integral to compiler implementation.  The type system helps prevent the majority of errors that you are likely to encounter while writing compilers.  Recursive data types express most internal compiler structures perfectly, and the presence of mutable storage and imperative I/O means that you don't need to jump through too may "functional programming" hoops in order to achieve some of the things for which those features are useful.  Also, the very existence of &lt;a href="http://www.cs.princeton.edu/%7Eappel/modern/ml/"&gt;Modern Compiler Implementation in ML&lt;/a&gt; by Andrew Appel (which I consider to be the best introductory text on compilers to date) should make you at least think about having a look before you decide on an implementation language for your next compiler.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;7) Don't be afraid of being the same&lt;/span&gt; - There is a tendency towards ensuring that your syntax and semantics are different from everything else in existence.  If language X has a feature that is similar to yours, do you really need to invent new syntax and terminology?  If something exists elsewhere, it's probably worth your time to figure out how it works and how it is expressed, and then re-implement it in a way that would be familiar to users of language X.  If you are actually interested in having people adopt your language, then you should reduce the learning curve as much as possible by only inventing new terminology and syntax where it is appropriate to do so (i.e., because it is either a novel feature, or there is some improvement that can be made by doing it differently).  If you're creating a new language, you've already established that there are things you can do differently or better, so it's not necessary to go overboard in differentiating yourself.  If it were, PHP would never have become popular in the presence of Perl.  Clojure and Scheme would never have survived the disadvantage of Lisp already existing, and so on.&lt;br /&gt;&lt;br /&gt;Having laid out all of these points, I'm certainly guilty of not taking my own medicine at times.  I've largely written this out because I'm currently going through a design and implementation process with &lt;a href="http://www.expdev.net/lope/"&gt;Lope&lt;/a&gt;, and I don't want to repeat my past mistakes!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4376947664270917518-6469986191188279850?l=plsadventures.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://plsadventures.blogspot.com/feeds/6469986191188279850/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://plsadventures.blogspot.com/2009/09/why-programming-language-design-is-hard.html#comment-form' title='9 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4376947664270917518/posts/default/6469986191188279850'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4376947664270917518/posts/default/6469986191188279850'/><link rel='alternate' type='text/html' href='http://plsadventures.blogspot.com/2009/09/why-programming-language-design-is-hard.html' title='Why programming language design is hard (and a few ways it can be made easier)'/><author><name>Gian</name><uri>http://www.blogger.com/profile/04384217245734242968</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='26' src='http://1.bp.blogspot.com/_AhA71miQEj0/SoqxZcE7mlI/AAAAAAAAAAM/SCAy8ISor6w/S220/zoinks.jpg'/></author><thr:total>9</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4376947664270917518.post-8209441298967291925</id><published>2009-09-03T16:30:00.004+01:00</published><updated>2009-09-03T17:15:31.228+01:00</updated><title type='text'>On the semantics of pseudo-code</title><content type='html'>I was reading a thread on &lt;a href="http://news.ycombinator.org/"&gt;HackerNews&lt;/a&gt; entitled "&lt;a href="http://news.ycombinator.com/item?id=802484"&gt;What language do you think in?&lt;/a&gt;" and I was fascinated by the responses.&lt;br /&gt;&lt;br /&gt;Many of those commenting claimed to think "in pseudo-code".  That got me thinking - what does it mean to think in pseudo-code?&lt;br /&gt;&lt;br /&gt;By the textbook definition, pseudo-code is some sort of idealised, abstract code with an informal syntax.  It serves as an aid to human understanding, making use of the abilities of humans to resolve ambiguity in a sensible way and "black box" a lot of minor details.  That all sounds fine.&lt;br /&gt;&lt;br /&gt;The catch for me comes in thinking about the semantics of this mental pseudo-code.  It still &lt;span style="font-style: italic;"&gt;has&lt;/span&gt; some semantics, I assume, otherwise it is likely to be entirely useless as a mental model.  Perhaps we think of a statement in pseudo-code for "sum all of the values in X":&lt;br /&gt;&lt;pre&gt;for each value k in X&lt;br /&gt;a = a + x&lt;br /&gt;&lt;/pre&gt;OK, so that seems a reasonable enough translation from an English language specification into some code.  We've left out some details like what the initial value of "a" is, and the syntax is pretty fluid, but it should be unambiguous to most readers.&lt;br /&gt;&lt;br /&gt;The problem is that this is only one step of nearly-mechanical translation from just writing code in a programming language.  There is a sneaky step of concretisation going on here, both in the framing of the statement, and the pseudo-code representation.&lt;br /&gt;&lt;br /&gt;First, in the framing of the statement, we've assumed a stored list of values exists and is called "X".  Even abstracting this away to "the list of input values" (referring to it by its purpose, rather than a label) is still commitment to a paradigm at this early stage.  On the language side, we've basically just taken the semantics of some reasonably ubiquitous "C-style" language (think C, Python, Ruby, whatever) and abstracted away some syntax.  This really doesn't seem very useful in terms of "abstract thinking".&lt;br /&gt;&lt;br /&gt;So is this really what people mean when they say "thinking in pseudo-code"?  If so, that's just really a coded way of saying "I mentally construct programs in an idealised version of language X", where 'X' is some language (or perhaps group of similar languages) that the programmer is familiar with.&lt;br /&gt;&lt;br /&gt;I'm really not a fan of early concretisation.  I can't think how many times I've been in a meeting discussing high-level requirements with a group of technical and non-technical people.  The conversation goes something like:&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;Non-Technical Person: We're really going to need X&lt;br /&gt;Me: Are you sure you need X?  Y seems to do everything you want equally well, and it might fit better with the established processes of the organisation.&lt;br /&gt;Non-Technical Person: The processes that are important are...&lt;br /&gt;Technical Person 1: ... we should do Z! because there will just be a couple of classes to add, and using language version 2.3.7 we can just use built-in serialisers to create an object store.&lt;br /&gt;Technical Person 2: No, 2.3.7 will require a kernel upgrade, we should use 2.3.6 and avoid that altogether.&lt;br /&gt;(...)&lt;br /&gt;Me: *head explosion*&lt;br /&gt;&lt;/blockquote&gt;Names and specifics (and a little dramatic effect) aside, I've really had this exact conversation more than once in more than one organisation.  What I've come to realise is that there are a lot of people who refuse to engage in a conversation in the abstract, and they have to mentally compile to code or implementation details in order to even engage with the conversation.  I think this is probably the result of people who genuinely &lt;span style="font-style: italic;"&gt;do&lt;/span&gt; think in pseudo-code in the way that I've described.&lt;br /&gt;&lt;br /&gt;Naturally, this is all speculative, because I'm not sure I could force myself to think in code if I tried.  I'm just not wired that way.  I need to have a high-level understanding of a problem (which allows me to mentally formulate a high-level solution - usually in terms of the same abstract concepts used to describe the problem), and then "programming" becomes a typing exercise, in which I mentally translate from a high-level computational model to the code I'm writing as I type it.&lt;br /&gt;&lt;br /&gt;So, I'm interested in hearing from people who &lt;span style="font-style: italic;"&gt;do&lt;/span&gt; claim to think in 'pseudo-code'.  Have I completely misrepresented what you mean when you say it?  Are you actually writing in a language with syntax in your head?  Do you close your eyes and see functions and objects and whatever else?  Similarly, if you don't think like that, have you observed others that do?&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4376947664270917518-8209441298967291925?l=plsadventures.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://plsadventures.blogspot.com/feeds/8209441298967291925/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://plsadventures.blogspot.com/2009/09/on-semantics-of-pseudo-code.html#comment-form' title='5 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4376947664270917518/posts/default/8209441298967291925'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4376947664270917518/posts/default/8209441298967291925'/><link rel='alternate' type='text/html' href='http://plsadventures.blogspot.com/2009/09/on-semantics-of-pseudo-code.html' title='On the semantics of pseudo-code'/><author><name>Gian</name><uri>http://www.blogger.com/profile/04384217245734242968</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='26' src='http://1.bp.blogspot.com/_AhA71miQEj0/SoqxZcE7mlI/AAAAAAAAAAM/SCAy8ISor6w/S220/zoinks.jpg'/></author><thr:total>5</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4376947664270917518.post-4936326062015210344</id><published>2009-08-31T19:58:00.008+01:00</published><updated>2009-08-31T21:44:19.984+01:00</updated><title type='text'>Making money from programming language design</title><content type='html'>I've had an interest in programming language design for a long time.  Lately, however, as I've been working on my &lt;a href="http://www.plsadventures.com/2009/08/lope.html"&gt;prototype bigraphs-based language&lt;/a&gt;, I've been pondering the economics of programming language design.  Not that I have ever been particularly interested in making vast sums of money from designing programming languages - I'm much more interested in solving interesting problems, but it's interesting that PL design is viewed as a bit of a poor cousin to other varieties of mad schemes.  You're a startup with a $100k/month burn rate designing a new Web 2.0 application that will simultaneously topple Facebook, Spotify and cure cancer?  Sure, sounds exciting!  You're designing a new programming language?  Oh, it'll never catch on.&lt;br /&gt;&lt;br /&gt;The lack of enthusiasm from those outside the PL design community probably stems from the fact that it is probably a bit of a lame duck in terms of earning money*.  I do it for fun and interest (and making the world a better place etc), but that argument doesn't fly with some people.&lt;br /&gt;&lt;br /&gt;&lt;small&gt;* I know, there are lots of other reasons.  The strange ways in which adoption and rejection of languages by users works would make me nervous if I were an investor.  Also, sometimes it's hard to get enthusiastic about conversations over the merits of particular syntactic constructs.&lt;/small&gt;&lt;br /&gt;&lt;br /&gt;So this post is dedicated to various ways that it might be possible to make money from programming language design (and implementation).  There is probably a large overlap here with the economics of software in general.  This is largely hypothetical, as I can't say that I've succeeded (or even tried) in making any significant amount of money from programming language design.  Similarly, I'm basically "thinking out loud" at any scale.  Some of these ideas might work to earn some beer money for one guy in a garage, while others might only be achievable by huge corporations with millions of dollars of contracts.  Either way, my point is to simply enumerate some ways in which programming language design and implementation itself could be financially rewarding.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;1) Commercial compilers&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;By "commercial compilers" I mean you create a single reference implementation that is "the compiler", and then you sell licenses for the compiler software.  This is probably not a serious option outside a few specialist domains, given the myriad of production quality compilers for most languages, almost all of which are free and open-source.  This might be an option if you are targeting some very specific (and probably very expensive) piece of hardware, especially if your language provides some value that the vendor-supplied implementation does not.  The "specific" part is probably the important part.  In the absence of other options, people become much more willing to part with money.  If they can just use the same open-source compiler for the same old language they've always used, they probably will.&lt;br /&gt;&lt;br /&gt;Incidentally, I am interested to know how the economics of the entire .NET thing has worked out.  I assume Visual Studio is still a commercial product?  It's been several thousand years since I looked at it.  I assume the incentive for the creation of C# was not to profit directly from the language, but rather to create a big vendor lock?  The existence of Mono creates an even stranger situation.  If you know more about this side of .NET than me, I'm eager to learn more.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;2) Free compilers, commercial add-ons&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;This is a pretty simple model.  Open source the compiler, then sell value-added IDEs, debuggers, accelerators, static analysis tools and anything else you can think of.  My understanding is that this is largely the route taken by Zend.  It's probably not a bad approach, weakly corresponding to the "the first hit is free" school of business employed by many entrepreneurs working on street corners.  In a situation where you control the language specification, and the implementation, then you are likely the best-positioned to create accompanying tools that "professional" users of the language are likely to want or need.  This permits people to become fully familiar with the language as a hobbyist, producing real (useful) software using the freely available tools, and then moving on to commercial programming environments where the cost of tools is hardly considered.&lt;br /&gt;&lt;br /&gt;Personally though, I'm pretty stingy when it comes to buying software that I don't really need.  Commercial IDE?  I don't use IDEs anyway.  Commercial debugger?  I'll do print-debugging if I have to.  The "our implementation is crap unless you buy X" stuff is pretty much an automatic way of making me not give your language a second thought.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;3) Sell support&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;You know, the old open source trick.  I have no interest in the debate as to whether this is commercially viable or not, but it seems to work for some people.  I'd rather not.  I assume this would essentially equate to some kind of commercial support, either becoming a producer of software in your language, or providing training and tool-support services.  This might extend to providing some kind of certification process for programmers.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;4) Domain-specific embeddings&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Domain-specific languages are hot right now.  I quite like the idea that a single language design could then be specialised into domains in exchange for money.  I think the key difference between making money from programming languages and software in general is that people are almost completely unwilling to pay for compilers or tools for general-purpose languages.  There's just no need in this day and age.  However, within a domain-specific context, it would appear that there may be more acceptance and willingness to pay for tools and support.  It might be that you can convince large company X to pay you to create a DSL for their particular domain, or it might simply be a matter of convincing users of your general-purpose language that their particular task would be easier/better/faster in some domain-specific variant that you sell.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;5) License the spec/controlled membership&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Form a consortium!  Charge bucketloads of money for interested parties to become members.  You probably need a lot of users with a lot of money (read: "enterprise") before this would cease to be a joke.  If you could get into this sort of situation, you're probably already making money from your language in one way or another.  Licensing the specification, certifying implementations, and simply collecting fees for pretty much anything sounds appealing.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;6) Write a book&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;I know even less about the economics of publishing than I do about the economics of programming languages, but I assume the fact that people continue to publish books mean that someone, somewhere is making money from them.  Writing a book about your language seems to be a good way to kill two birds with one stone (i.e. write good quality documentation, and possibly make some money if enough people buy it).&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;7) Ad Revenue&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;I doubt this is going to be enough to retire on, but programming language websites often have the advantage of traffic.  Hosting the tool downloads and the documentation can provide an incentive for a lot of repeat visitors.  It might not be much, but it's always nice when your hobby is able to buy you a pizza a couple times a year, right?&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;8) Conferences and speaking engagements&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;I'm not sure how large a community has to get before speaking engagements would be likely to take up more than a week in any given year.  Conferences are probably hit-and-miss in terms of revenue generation.  You might end up making a little bit of money, but you could similarly lose a whole bunch.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;9) Embedded runtimes/interpreters/compilers&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;If you could convince a manufacturer of some device (e.g, a mobile phone) to include your implementation of language X, then presumably you could negotiate royalties per-device.  I assume that some of the mobile Java implementations provide some sort of financial incentives for their creators?&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;10) ???&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Can anyone else think of ways in which programming language design (or implementation) can be used to generate revenue?  Let me know in the comments, because if you have a guaranteed get-rich-quick-scheme, I might have &lt;a href="http://www.expdev.net/lope/"&gt;a programming language to sell you&lt;/a&gt; :-p&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4376947664270917518-4936326062015210344?l=plsadventures.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://plsadventures.blogspot.com/feeds/4936326062015210344/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://plsadventures.blogspot.com/2009/08/making-money-from-programming-language.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4376947664270917518/posts/default/4936326062015210344'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4376947664270917518/posts/default/4936326062015210344'/><link rel='alternate' type='text/html' href='http://plsadventures.blogspot.com/2009/08/making-money-from-programming-language.html' title='Making money from programming language design'/><author><name>Gian</name><uri>http://www.blogger.com/profile/04384217245734242968</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='26' src='http://1.bp.blogspot.com/_AhA71miQEj0/SoqxZcE7mlI/AAAAAAAAAAM/SCAy8ISor6w/S220/zoinks.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4376947664270917518.post-2682983496734729595</id><published>2009-08-26T17:17:00.003+01:00</published><updated>2009-08-31T20:00:32.335+01:00</updated><title type='text'>Lope</title><content type='html'>I'm working on a programming language that follows the general principles of Bigraphs and bigraphical reactive systems.  For simplicity's sake, think term-rewriting with process mobility.&lt;br /&gt;&lt;br /&gt;I've written up something that may eventually become documentation for an implementation of this language at:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.expdev.net/lope/"&gt;http://www.expdev.net/lope/&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;It's still a work in progress, but I would definitely appreciate any thoughts or comments.  You can either comment here or email me using the address in the side-bar.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4376947664270917518-2682983496734729595?l=plsadventures.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://plsadventures.blogspot.com/feeds/2682983496734729595/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://plsadventures.blogspot.com/2009/08/lope.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4376947664270917518/posts/default/2682983496734729595'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4376947664270917518/posts/default/2682983496734729595'/><link rel='alternate' type='text/html' href='http://plsadventures.blogspot.com/2009/08/lope.html' title='Lope'/><author><name>Gian</name><uri>http://www.blogger.com/profile/04384217245734242968</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='26' src='http://1.bp.blogspot.com/_AhA71miQEj0/SoqxZcE7mlI/AAAAAAAAAAM/SCAy8ISor6w/S220/zoinks.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4376947664270917518.post-4304983920581496617</id><published>2009-08-25T13:11:00.004+01:00</published><updated>2009-08-25T14:43:19.411+01:00</updated><title type='text'>I laugh at your puny human 'objects'</title><content type='html'>Let's say that Earth is being invaded by aliens.  The last tatters of the US government (and a motley crew of scientists and caravan enthusiasts) are holed up in a secret government base in the desert, where a crashed alien ship is stored.  One of the dashing action heroes realises that he can engineer a computer virus to take out the alien shields that are preventing any meaningful counter-attack.  Our brave hacker whips up some code, which must be delivered to the alien mothership using the (now-repaired) alien fighter.  They sneak into the mothership undetected and upload the virus.&lt;br /&gt;&lt;br /&gt;At the moment of truth, as they are about to detonate a nuclear device and make their escape, the virus program throws an exception because the behaviour of the 'AlienMothership' class differed slightly from our hero's interpretation, and the aliens succeed in taking over the world.&lt;br /&gt;&lt;br /&gt;OK, so I ripped most of that off from &lt;a href="http://www.imdb.com/title/tt0116629/"&gt;Independence Day&lt;/a&gt;, but I think it goes to the heart of my least favourite technology:  objects.&lt;br /&gt;&lt;br /&gt;I'll freely admit that this is a bit of a rant.  You're entitled to disagree.&lt;br /&gt;&lt;br /&gt;Objects are a bad model for the universe.  There, I said it.  Object Oriented Programming (OOP) is oversold as a "natural" way of modelling things.  It really isn't.  Let's examine why:&lt;br /&gt;&lt;br /&gt;Let's start with encapsulation.  I have no problem with encapsulation, as such.  It's a great idea.  Except when it isn't.  When you have hidden (mutable) state, I think it's pretty much a terrible idea.  You define a nice tidy interface (or class definition, or whatever terminology you prefer), and then call some methods from another location.  At any given moment, that object is in some state, and the meaning of calling another method on that object could be completely different.  X.a() then X.b() may be completely different to X.b() then X.a().  This implied ordering is not really expressed anywhere.  There might be an in-source comment or a note in the documentation that mentions it, but really, you're on your own here.&lt;br /&gt;&lt;br /&gt;I'm not talking about &lt;span style="font-style: italic;"&gt;incorrect&lt;/span&gt; behaviour here either, where it would be possible to check (at runtime) that a call to X.a() never occurs after X.b().  I'm referring to a situation where both execution orders are perfectly valid, but have totally different meanings.  How can I resolve this?  Well, naturally, I just look inside the class and figure out how internal state is modified and what the behaviour will be depending on the internal state.  Whoops!  I'm basically back in imperative-land.&lt;br /&gt;&lt;br /&gt;To summarise, I think hiding behaviour inside objects makes no sense.  "Hiding" computation away is basically ignoring the point of the program.  Abstracting away irrelevant computation makes sense, but OOP takes this paradigm way too far, hiding away both irrelevant computation &lt;span style="font-style: italic;"&gt;and &lt;/span&gt;the computation we are interested in.&lt;br /&gt;&lt;br /&gt;I asserted earlier that I thought objects were a poor model for the universe.  I'll try to elaborate on this claim.&lt;br /&gt;&lt;br /&gt;The classic OO examples always feature things that are somewhat modular in their design, interacting in terms of clearly defined events, with some internal state.  For example, one might model a bicycle with wheels and a 'pedal()' method.  Or a 'Point' class that perfectly captures the fact that a point has x and y coordinates, with some methods to handle translation or rotation or any other interesting operation.&lt;br /&gt;&lt;br /&gt;But a 'point' is just data.  Why does it have behaviour?  In this case we're basically just using objects to control name spaces.  It's nice to be able to have the word 'point' appear in operations on points, and have all of  the functionality associated with a point in one place.  Great, so let's do that, but instead of hiding the state, let's have a 'Point' module with &lt;span style="font-style: italic;"&gt;no internal state&lt;/span&gt;, which gathers up all of the point-related functionality that operates on values of type 'point'.&lt;br /&gt;&lt;br /&gt;A bike won't do anything without a person to pedal it.  A bike makes good sense from an OOP point of view.  We can have some internal state, recording things about the bike.  It's position and maybe its speed and direction of travel.  The "pedal()" and "turnHandleBars()" methods alter this internal state.  But a bike exists in the real world.  The "world" records the position of the bike, not the bike itself.  You can't pick up a bike in the real world and ask it where it is, or where it is going.  A real bike stores none of this information, rather the &lt;span style="font-style: italic;"&gt;context&lt;/span&gt; in which the bike exists tells us all of that information.&lt;br /&gt;&lt;br /&gt;So when we model a 'Bike' in an OO context, we're actually talking about a bike in a particular context.  The claimed decoupling just hasn't happened.  It's unlikely that any natural implementation of a bike could be picked up and re-used in a context where different world-assumptions exist.  Perhaps we want to ride a bike on the moon!  It seems unlikely we'd be able to do that without going back and changing some internal 'gravity' parameter inside the object.  When we model the world this way, we are encoding all sorts of assumptions about the world into the object itself.&lt;br /&gt;&lt;br /&gt;If we do the 'right thing' here and remove any world-specific state from the Bike class, what have we got left?  In this case, essentially an empty object that simply records the fact that a bike exists.  Why do we need an object to do that?&lt;br /&gt;&lt;br /&gt;Then there's the question of active and passive behaviour.  A bike does not move itself.  A person does.  A method in the 'Person' class might call the 'pedal' method on the Bike class.  This is a relatively sane operation.  Calling methods on a Person becomes a much stranger kind of game.  Obviously this is a &lt;span style="font-style: italic;"&gt;modelling&lt;/span&gt; task, so we're abstracting away from the real world, but there is only one kind of method-calling, which models the interaction between a passive system and an active one.  Where is the thread of program control?  What about active objects calling active objects?  Where do 'utility' functions fit in?  They have no real-world equivalent, but they still get shoe-horned into the OO paradigm.  Don't even mention concurrency in this situation - that's just a horror show waiting to happen.&lt;br /&gt;&lt;br /&gt;This isn't just irrational OOP-hatred, by the way.  I used to do a lot of programming in Java and C++.  These days I use SML.  I've found it to be a vast improvement on the situation, and here is why:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Records, Tuples and recursive data types fulfil most of the "structured data storage" uses of objects.&lt;/li&gt;&lt;li&gt;The module system provides the kind of name-space control and "bundling" for which people use classes.&lt;/li&gt;&lt;li&gt;Referential transparency means you can understand easily the meaning and purpose of a given function that operates on data.  There is no "hidden state" which modifies this behaviour in unpredictable ways.&lt;/li&gt;&lt;li&gt;The "control" of the program follows one predictable path of execution, making mental verification considerably easier.&lt;br /&gt;&lt;/li&gt;&lt;li&gt;Important computation isn't inadvertently "hidden away".  If computation happens, it is because there is a function that does it explicitly.&lt;/li&gt;&lt;li&gt;Clear distinction between "passive" data and "active" computation.&lt;/li&gt;&lt;li&gt;No hidden state means it is much harder to unintentionally model the object of interest and its context together.&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;The only thing that object orientation does for most programs is to provide basic name-space control.  I don't think it significantly helps maintainability or readability.  There are much better ways to do name-space control than wheeling out a giant, unwieldy object system that everything has to be crammed into.&lt;br /&gt;&lt;br /&gt;I think the problem essentially exists as a result of static modelling of the universe.  If you freeze a moment in time, then sure, everything &lt;span style="font-style: italic;"&gt;does&lt;/span&gt; look like objects.  But we're interested in behaviour and time and computation, none of which are captured very well by bolting on some "behaviour" to an essentially static model of a system.  And sadder still, instead of putting the behaviour somewhere sensible, it gets wedged into the static models themselves!&lt;br /&gt;&lt;br /&gt;There are a few other things I really dislike.  Inheritance, for example.  this post is getting really long though, so I'll stop here with the comment: If I was fighting the alien hoards, it wouldn't be with objects.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4376947664270917518-4304983920581496617?l=plsadventures.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://plsadventures.blogspot.com/feeds/4304983920581496617/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://plsadventures.blogspot.com/2009/08/i-laugh-at-your-puny-human-objects.html#comment-form' title='11 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4376947664270917518/posts/default/4304983920581496617'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4376947664270917518/posts/default/4304983920581496617'/><link rel='alternate' type='text/html' href='http://plsadventures.blogspot.com/2009/08/i-laugh-at-your-puny-human-objects.html' title='I laugh at your puny human &apos;objects&apos;'/><author><name>Gian</name><uri>http://www.blogger.com/profile/04384217245734242968</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='26' src='http://1.bp.blogspot.com/_AhA71miQEj0/SoqxZcE7mlI/AAAAAAAAAAM/SCAy8ISor6w/S220/zoinks.jpg'/></author><thr:total>11</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4376947664270917518.post-3324726571538160458</id><published>2009-08-21T15:57:00.004+01:00</published><updated>2009-08-21T16:56:22.577+01:00</updated><title type='text'>Worrying trends in programming languages:  Javascript</title><content type='html'>Why is Javascript so popular?&lt;br /&gt;&lt;br /&gt;It seems that in certain circles, the answer to any question appears to be "Javascript!".  With the rise of large social networking platforms and browser-based "desktop-style" applications, it appears that more and more of what we do on a daily basis is likely to be powered by traditionally "web" technologies.&lt;br /&gt;&lt;br /&gt;The announcement of &lt;a href="http://googleblog.blogspot.com/2009/07/introducing-google-chrome-os.html"&gt;Google Chrome OS&lt;/a&gt; seems to suggest that at least some people believe that the desktop and the web should no longer be distinct at all.  This is totally worthy as a goal, and I like the idea of removing the distinction.  However the implications are a bit scary.&lt;br /&gt;&lt;br /&gt;Some people (me included) don't really like writing Javascript, but luckily there are several compiler projects that will generate Javascript from your non-Javascript code (Google Web Toolkit springs to mind immediately, though I'm sure there are many others).&lt;br /&gt;&lt;br /&gt;Javascript is an odd triumph in programming language terms.  It is almost universally available in modern web browsers.  It's truly "cross platform" in all the ways that Java really isn't, in that most implementations are reasonably standard across platforms, without the requirement to install a vendor-specific virtual machine or anything like that.&lt;br /&gt;&lt;br /&gt;But couldn't we do better?  Why on earth would we take a language that can be compiled to efficient native code and then compile it into a slow, interpreted language?  We appear to be ignoring almost all of the advances made in programming languages in the last 20 years.  By compiling to Javascript, we &lt;span style="font-style: italic;"&gt;throw away information&lt;/span&gt; that would permit efficient optimisation or static analysis for security.  It just seems insane!&lt;br /&gt;&lt;br /&gt;The situation is complicated further by the likes of Adobe's Flash and (more recently) Microsoft's Silverlight.  With Flash, I am basically instantiating yet another Javascript (well, Actionscript) environment inside my browser.  The goals are essentially the same.  If it's possible (and even desirable!) to start writing desktop-style applications in Javascript, shouldn't Javascript be able to do almost everything that Flash or Silverlight can?  Surely with a little thinking, we could do better.&lt;br /&gt;&lt;br /&gt;Well I'm perfectly aware of why Javascript has become as popular as it is, I don't see why it should continue to become more popular.  Embedding a virtual machine inside a web browser that would permit execution of some quasi-compiled form of Javascript would be one way.  Open up the floodgates and let other languages target the same VM directly.  With a bit of JIT compilation, it could probably even be an efficient way to execute some programs, within a browser-based security model.  You could even implement a version of the VM in Javascript itself for legacy browsers that don't implement said VM.&lt;br /&gt;&lt;br /&gt;But this still appears to be an imperfect solution.  Ignoring the problems associated with trying to encourage standardisation or adoption of web technologies, it's still wasteful and just replaces the question "why are we writing desktop applications in Javascript?" with "Why are we running desktop applications inside a web browser?"&lt;br /&gt;&lt;br /&gt;The basic technological need that Javascript fulfils very well is the ability to run untrusted code in a standard way, without needing to download or install anything.  It provides a standard way to display and interact with an application through the browser, while still allowing nearly limitless customisation.&lt;br /&gt;&lt;br /&gt;So putting on our programming language designer hats, what would a Javascript-replacement need to provide?&lt;br /&gt;&lt;ol&gt;&lt;li&gt;Security - the ability to run untrusted code in such a way that we need not worry about its source.  Similarly, we must be able to control the information it has access to, allowing it certain types of information but at a level far more restrictive than user-level operating system access control would provide.&lt;/li&gt;&lt;li&gt;Zero installation - We may want the &lt;span style="font-style: italic;"&gt;illusion&lt;/span&gt; of persistence, but we want to be able to access this application from any machine we may sit down at, without worrying about platform installation restrictions or where our data goes.&lt;/li&gt;&lt;li&gt;Uniform display mechanism - HTML still appears to be the best candidate here.  We still need to be able to render HTML, but I imagine if we could provide some native components that support hardware acceleration for things like video and 3D applications, we might kill two birds with one stone.&lt;/li&gt;&lt;/ol&gt;So, in my very armchair-philosophical way, I think that this can all be solved in one easy step:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Download, compile and run anything presented within a web page&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;That's right!  Let's just run it.  It would need to be in some sort of bytecode form, but that bytecode would be designed in such a way that it is relatively easy for the browser (or operating system?) to transparently compile it to native code.  The application has the full privileges afforded to any other desktop application.  It can access and manipulate the DOM in the web browser by any means it wishes.  We could presumably provide a list of resources available to the application, so that if we want for it to be able to access an OpenGL or DirectX surface to do fast 3D, or display streaming video, this is fine.&lt;br /&gt;&lt;br /&gt;Now, before you write me off as a complete moron, I know, this would probably present some security problems.  We might not want to allow unrestricted access to our file system or memory to untrusted programs.  This is where I think we need one more addition to our system: &lt;a href="http://en.wikipedia.org/wiki/Proof-carrying_code"&gt;Proof-Carrying Code&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;The basic idea is that the bytecode that is embedded as a resource in a web page comes with a proof.  While &lt;span style="font-style: italic;"&gt;writing&lt;/span&gt; proofs is difficult, &lt;span style="font-style: italic;"&gt;checking&lt;/span&gt; proofs is computationally inexpensive and relatively simple.  The program provides a proof that it is well-behaved with respect to the restrictions we wish to place upon untrusted web-based programs, and our browser can verify that the proof does indeed guarantee that the bytecode is well-behaved.  You can think of the proof as a portfolio of evidence for the safety of executing the bytecode - the browser can inspect this evidence (and the bytecode) and establish that the security properties hold itself - we don't need to "trust" the proof at all.  As long as this proof-checking succeeds, we can run this program with total confidence in its inability to do anything bad to our system.&lt;br /&gt;&lt;br /&gt;However, as I said before, &lt;span style="font-style: italic;"&gt;writing&lt;/span&gt; proofs is difficult.  Except that in this case, our compiler should be able to do it largely automatically.  If you write your web-based program in a high-level language that provides no facilities for doing anything bad, then it should be relatively straightforward to automatically construct a proof that will satisfy the browser's proof-checker.  The difficulty associated with proving that the program is well-behaved is going to be largely proportional to the low-level constructs you wish to access.  It might not be viable to verify certain kinds of programs - I'm not sure, but at least our efforts would be going into figuring out how to construct proofs automatically, rather than figuring out how to force everything we want to do with computers into the limitations of an antiquated scripting language.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4376947664270917518-3324726571538160458?l=plsadventures.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://plsadventures.blogspot.com/feeds/3324726571538160458/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://plsadventures.blogspot.com/2009/08/worrying-trends-in-programming.html#comment-form' title='5 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4376947664270917518/posts/default/3324726571538160458'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4376947664270917518/posts/default/3324726571538160458'/><link rel='alternate' type='text/html' href='http://plsadventures.blogspot.com/2009/08/worrying-trends-in-programming.html' title='Worrying trends in programming languages:  Javascript'/><author><name>Gian</name><uri>http://www.blogger.com/profile/04384217245734242968</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='26' src='http://1.bp.blogspot.com/_AhA71miQEj0/SoqxZcE7mlI/AAAAAAAAAAM/SCAy8ISor6w/S220/zoinks.jpg'/></author><thr:total>5</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4376947664270917518.post-921272576755137028</id><published>2009-08-20T16:58:00.006+01:00</published><updated>2009-08-20T20:12:42.512+01:00</updated><title type='text'>A prototype static analysis tool for observable effects</title><content type='html'>I've had a piece of code kicking around for a while that hasn't really been &lt;span style="font-style: italic;"&gt;doing&lt;/span&gt; anything, so I've decided to wrap it in a basic web front-end and unleash it upon the unsuspecting public.&lt;br /&gt;&lt;br /&gt;It's a tool for performing a very basic kind of specification checking for a simple ML-like language.  The idea is that you construct a specification consisting of *traces* of observable effects, and then the compiler can statically determine whether your program behaves according to this specification.  It does this by automatically inferring a process algebra-like &lt;span style="font-style: italic;"&gt;behavioural type&lt;/span&gt; for the program, and then using this as a representation against which to perform specification checking in a tractable way.&lt;br /&gt;&lt;br /&gt;An example:&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;OPERATIONS&lt;br /&gt;insertCoin&lt;br /&gt;button1&lt;br /&gt;button2&lt;br /&gt;takeDrink1&lt;br /&gt;takeDrink2&lt;br /&gt;coinReturnButton&lt;br /&gt;takeCoin&lt;br /&gt;&lt;br /&gt;NECESSARY&lt;br /&gt;&lt;br /&gt;SAFE&lt;br /&gt;insertCoin,button1,takeDrink1&lt;br /&gt;insertCoin,button2,takeDrink2&lt;br /&gt;&lt;br /&gt;FORBIDDEN&lt;br /&gt;insertCoin,_*,insertCoin&lt;br /&gt;&lt;/pre&gt;The &lt;span style="font-style: italic;"&gt;Operations&lt;/span&gt; section defines which operations are of interest to us.  Naturally our program could be producing any number of observable actions, but we may only be interested in specifying certain types of behaviours.  In this case, we're specifying a program that interacts with a vending machine.&lt;br /&gt;&lt;br /&gt;Each action (e.g. &lt;span style="font-style: italic;"&gt;insertCoin&lt;/span&gt;) corresponds to some library call in the language that produces a side-effect.  Because the language (which I will describe shortly) is "mostly functional" (in the style of ML), there is no mechanism to introduce new side-effects - they have to come from a built-in library function, or some foreign function interface.&lt;br /&gt;&lt;br /&gt;The (empty) &lt;span style="font-style: italic;"&gt;Necessary&lt;/span&gt; section is the strongest type of specification - it says "the program must be able to behave according to all of these traces, and must not be able to exhibit any other behaviour".  When it is empty (as in this example), it has no effect.&lt;br /&gt;&lt;br /&gt;The &lt;span style="font-style: italic;"&gt;Safe&lt;/span&gt; section specifies a weak liveness property, "the program must be able to behave according to at least one of these traces".  We have two separate traces defined in our &lt;span style="font-style: italic;"&gt;Safe&lt;/span&gt; section in this example, each corresponding to a "sensible" interaction between our program and the vending machine.  The compiler will halt and issue an error if the program is not able to behave according to any of the traces in a non-empty &lt;span style="font-style: italic;"&gt;Safe&lt;/span&gt; section.&lt;br /&gt;&lt;br /&gt;The final section is the &lt;span style="font-style: italic;"&gt;Forbidden&lt;/span&gt; traces.  These act as a general "pattern match" across all the possible behaviours that a program is able to exhibit.  In this case, we specify that we never want our program to be able to insert two coins into the machine.  The underscore (_) operator is a wild card, standing for any action that appears in the &lt;span style="font-style: italic;"&gt;Operations&lt;/span&gt; section.  The asterix (*) has the normal meaning one would expect from regular expressions, meaning "zero or more occurrences of this action".  So the final trace demands that there never be two occurrences of &lt;span style="font-style: italic;"&gt;insertCoin&lt;/span&gt; in &lt;span style="font-style: italic;"&gt;any&lt;/span&gt; of the possible program behaviours.&lt;br /&gt;&lt;br /&gt;Now, an implementation!&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;fun main x = let v = insertCoin () in (if rand_boolean() then ((button1 ());(takeDrink1 ())) else ((button2 ());(takeDrink2 ())))&lt;br /&gt;&lt;br /&gt;val _ = main ()&lt;br /&gt;&lt;/pre&gt;Now this implements our specification.  The web-based tool will report "specification satisfied", and will print a visual representation of the behavioural type that it has inferred from the program text.&lt;br /&gt;&lt;br /&gt;The tool is at: &lt;a href="http://www.expdev.net/tracespec/"&gt;http://www.expdev.net/tracespec/&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;I will upload the (as of yet) unpublished paper describing the internals of the system and hopefully some better examples and documentation later tonight.  Let me know if you would like to know more.&lt;br /&gt;&lt;br /&gt;The tool is pretty fragile at the moment.  The error reporting is very bad in most cases.  Let me know if you come across any totally weird behaviour, because part of this unleashing-on-the-public exercise is to try to figure out what is and isn't working, and how I can improve it, so I would appreciate any feedback.&lt;br /&gt;&lt;br /&gt;The idea (before you ask) is the encourage a &lt;span style="font-style: italic;"&gt;very light weight&lt;/span&gt; form of specification.  There is no execution going on here.  It is supposed to mimic the function of certain types of unit tests in a&lt;span style="font-style: italic;"&gt; &lt;/span&gt;completely static way.  The way it is implemented at the moment is a rough prototype.  Once I have proved to myself that this is a feasible kind of static analysis to perform, I may work on integrating it into a compiler for a general-purpose language.  There is no particular restriction on the source language, however it is most likely to be feasible for a strongly typed functional language.&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;The tool: &lt;a href="http://www.expdev.net/tracespec/"&gt;http://www.expdev.net/tracespec&lt;/a&gt;&lt;/li&gt;&lt;li&gt;An (unpublished) paper describing the system:  &lt;a href="http://www.expdev.net/tracespec/doc/tracespec.pdf"&gt;http://www.expdev.net/tracespec/doc/tracespec.pdf&lt;/a&gt; - A few things have changed, as I have been developing the system further to overcome the limitations described in the paper.&lt;/li&gt;&lt;li&gt;A few small examples: &lt;a href="http://www.expdev.net/tracespec/doc/examples/"&gt;http://www.expdev.net/tracespec/doc/examples/&lt;/a&gt;&lt;/li&gt;&lt;li&gt;A Git repository and source code will follow shortly.  In the mean time, you can contact me by adding gian.perrone to @ and gmail.com in order to get an email address.&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4376947664270917518-921272576755137028?l=plsadventures.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://plsadventures.blogspot.com/feeds/921272576755137028/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://plsadventures.blogspot.com/2009/08/prototype-static-analysis-tool-for.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4376947664270917518/posts/default/921272576755137028'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4376947664270917518/posts/default/921272576755137028'/><link rel='alternate' type='text/html' href='http://plsadventures.blogspot.com/2009/08/prototype-static-analysis-tool-for.html' title='A prototype static analysis tool for observable effects'/><author><name>Gian</name><uri>http://www.blogger.com/profile/04384217245734242968</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='26' src='http://1.bp.blogspot.com/_AhA71miQEj0/SoqxZcE7mlI/AAAAAAAAAAM/SCAy8ISor6w/S220/zoinks.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4376947664270917518.post-2847197996308998654</id><published>2009-08-18T11:40:00.001+01:00</published><updated>2009-08-18T12:43:50.960+01:00</updated><title type='text'>Hardware/Software Co-design</title><content type='html'>I'll preface this post by saying:  I don't know much about hardware.&lt;br /&gt;&lt;br /&gt;I have, however, spent a lot of time around hardware engineers.  As a very naive programmer, I generally assumed that what they did was programming with some "extra bits" added, though I've long since been corrected.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Hardware is fast&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;How fast?  For FPGA developments, I believe many applications can be 200 times faster.  I  believe even a reasonably inexperienced hardware designer can achieve speed ups of 50-100 times on many applications.  It's also worth thinking about power consumption.  The power consumption of an FPGA can often be a small fraction of the equivalent processing power required to hit the same performance constraints using commercially-available microprocessors.&lt;br /&gt;&lt;br /&gt;This all sounds wonderful.  Why aren't we all doing this?  Well, for me, the problem is basically me.&lt;br /&gt;&lt;br /&gt;I have neither the tools, nor the experience, to design non-trivial solutions to problems in hardware.  If I were to sit down and try to solve a problem in hardware, I would probably end up writing VHDL or Verilog in much the same way that I would write C or Java, which would result in really, really awful hardware, which would be unlikely to be efficient.  The maximum clock-rate would be very low, and finding a device big enough to fit my resulting hardware on could become an expensive undertaking.&lt;br /&gt;&lt;br /&gt;That said, I still really like the &lt;span style="font-style: italic;"&gt;idea&lt;/span&gt; of using hardware to solve problems.  While specialist hardware engineers will always have the job of designing high-performance hardware with tight performance constraints, is there any low-hanging fruit which we (as programmers) could target for using reconfigurable logic devices as part of our programs?&lt;br /&gt;&lt;br /&gt;The most obvious way of achieving this seems to be with the general idea of "hardware acceleration", where a special-purpose device (such as a PCI-X card with an ASIC or FPGA implementation of some algorithm) takes over some of the hard work from the CPU.  The device manufacturer provides some sort of software library that hides the hardware interaction behind nice high-level function calls that are familiar territory for programmers.&lt;br /&gt;&lt;br /&gt;These are great for specific applications, but I'm not aware of any "general purpose" cards that would permit programmers to integrate them into an arbitrary application without having some knowledge of hardware design.&lt;br /&gt;&lt;br /&gt;Perhaps the next obvious alternative is compiling software into hardware.  This is dangerous territory.  There have been so many claims of software-to-hardware compilers that I won't even bother to list them.  I'm sure you can find hundreds with a quick Google search.  The problem with &lt;span style="font-style: italic;"&gt;most&lt;/span&gt; of these is that they are actually just hardware semantics hidden underneath a veneer of some mainstream programming language.  A programmer familiar with the source language will still not be able to produce efficient hardware without knowing something about hardware.&lt;br /&gt;&lt;br /&gt;I think there could be two possible solutions, which I will sketch here:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Component-based Approach&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Presumably we could get experienced hardware engineers to implement a wide variety of software-style components as efficient hardware.  We use the same approach as the domain-specific hardware accelerator manufacturers, hiding the complexity of hardware interaction behind some nice software libraries.  If you had a significant portion of the expensive operations from a programming language's standard library available as hardware components, it wouldn't take a particularly smart compiler to replace certain standard library calls with calls to the (functionally identical) hardware-wrapper library.&lt;br /&gt;&lt;br /&gt;But there are some problems that I can see.&lt;br /&gt;&lt;br /&gt;It is unlikely to be efficient in terms of communication, area or power consumption to simply push a whole lot of small, infrequently-used library calls into hardware.  There would need to be a lot of static (or runtime profiling?) analysis to determine where there are efficiencies to be gained.  The compiler would be forced to "schedule" components into the finite area available on the FPGA.  This would have implications for latency and throughput.  It might be necessary to buffer large numbers of calls before executing them.  The performance gains from doing operations in hardware are likely to be eaten up by the communication overhead resulting from having to move data from registers in the CPU (or from processor cache, or from main memory) to the FPGA.  Similarly in a traditional imperative setting, this is going to be an interruption of program control.  The program will have to stop and wait for the results from the FPGA.  The alternative is concurrency, but as I've previously said, &lt;a href="http://plsadventures.blogspot.com/2009/08/if-concurrency-is-easy-youre-either.html"&gt;concurrency is hard&lt;/a&gt;, and this is the type of heterogeneous task-level parallelism that is &lt;span style="font-style: italic;"&gt;really&lt;/span&gt; hard.&lt;br /&gt;&lt;br /&gt;So it appears that this approach would require putting a lot of smarts into the compiler.  We would need good performance heuristics, something resembling auto-vectorisation and automatic parallelisation, and likely an ability to trade-off latency for throughput.&lt;br /&gt;&lt;br /&gt;It all seems a little bit challenging, given the state of the art of each of those areas.  It would appear that some of the purely functional languages would be best-suited to this kind of approach, but even in the absence of side-effects it seems uh...  difficult, at best.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Language-based Approach&lt;br /&gt;&lt;br /&gt;&lt;/span&gt;So if we're not going to expect the compiler to perform a rather sophisticated brand of super-compilation on our programs, can we instead structure our programs (and programming languages) to make this kind of hardware/software co-design easier?&lt;br /&gt;&lt;br /&gt;I believe that this is one of the possible future directions for the &lt;a href="http://www.cs.waikato.ac.nz/research/jstar/"&gt;JStar&lt;/a&gt; project, although I'm not aware of the technical details in this case.  The data-structure-free approach may permit the kinds of analyses required to allow for efficient scheduling of components onto an external FPGA, so it definitely seems like an interesting avenue to keep an eye on.&lt;br /&gt;&lt;br /&gt;Another avenue has to be concurrent (and functional?) languages.  These are designed to express concurrent computation, which is essentially what we are trying to maximize, while minimising the presence of mutable state.  With mutable state largely out of the way, you are free to have copies of data lying around.  There are still questions as to how we maximise the efficiency (and indeed, even approach &lt;span style="font-style: italic;"&gt;predicting&lt;/span&gt; the efficiency).&lt;br /&gt;&lt;br /&gt;I'm aware of efforts to compile &lt;a href="http://en.wikipedia.org/wiki/Esterel"&gt;Esterel&lt;/a&gt; to hardware, though I have no idea to what degree this is practical or whether it is used in industry.&lt;br /&gt;&lt;br /&gt;It strikes me that a model-based approach to software development could be used to make the entire process a lot less painful.  By starting with a high-level model, it would be possible to refine towards both hardware and software implementations of different parts of the system.  Provided you could contrive acceptable heuristics for predicting performance, it seems like you could start to play this sort of game:&lt;br /&gt;&lt;br /&gt;I have a high-level abstract specification S.&lt;br /&gt;&lt;br /&gt;I have a set of (hardware) components:&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;H = {h1,h2,h3}&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;I have some software components that are synthesised from S, or are user-supplied:&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;W = {w1,w2,w3}&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;We want to choose a subset X ⊆ H and Y ⊆ W, such that:&lt;br /&gt;&lt;br /&gt;S ⊑ x1 || ... || xn || y1 || ... || yn&lt;br /&gt;&lt;br /&gt;Where '⊑' is some appropriate notion of refinement and '||' is parallel composition.  I'm not sure if this is more tractable than simply solving the problem outright for a given language, but it certainly seems like it would be considerably easier to verify the correctness of the decomposition of the program into software and hardware components, which would seem rather more difficult in this case than for a traditional compiler.&lt;br /&gt;&lt;br /&gt;Is anyone aware of any projects that attempt this kind of thing?  I'd be interested to hear from hardware and software people alike about their opinions on enabling programmers with no specialist hardware knowledge to use hardware in a manner that is both &lt;span style="font-style: italic;"&gt;efficient&lt;/span&gt; and reasonably familiar.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4376947664270917518-2847197996308998654?l=plsadventures.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://plsadventures.blogspot.com/feeds/2847197996308998654/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://plsadventures.blogspot.com/2009/08/hardwaresoftware-co-design.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4376947664270917518/posts/default/2847197996308998654'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4376947664270917518/posts/default/2847197996308998654'/><link rel='alternate' type='text/html' href='http://plsadventures.blogspot.com/2009/08/hardwaresoftware-co-design.html' title='Hardware/Software Co-design'/><author><name>Gian</name><uri>http://www.blogger.com/profile/04384217245734242968</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='26' src='http://1.bp.blogspot.com/_AhA71miQEj0/SoqxZcE7mlI/AAAAAAAAAAM/SCAy8ISor6w/S220/zoinks.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4376947664270917518.post-6281573442609595859</id><published>2009-08-17T11:33:00.000+01:00</published><updated>2009-08-17T11:35:43.912+01:00</updated><title type='text'>Housekeeping</title><content type='html'>I'm working on another post, but I thought I would briefly mention two things:&lt;br /&gt;&lt;br /&gt;1) I have turned on Google AdSense for this blog.  Let me know if the ads are particularly irritating.  I mostly turned it on for my own personal curiosity, rather than because I'm hoping to quit my day job using the ad revenue.&lt;br /&gt;&lt;br /&gt;2) I'm on Twitter - &lt;a href="http://www.twitter.com/gianp"&gt;http://www.twitter.com/gianp&lt;/a&gt; - I've actually had an account for a long time, but I'm actively using it now.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4376947664270917518-6281573442609595859?l=plsadventures.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://plsadventures.blogspot.com/feeds/6281573442609595859/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://plsadventures.blogspot.com/2009/08/housekeeping.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4376947664270917518/posts/default/6281573442609595859'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4376947664270917518/posts/default/6281573442609595859'/><link rel='alternate' type='text/html' href='http://plsadventures.blogspot.com/2009/08/housekeeping.html' title='Housekeeping'/><author><name>Gian</name><uri>http://www.blogger.com/profile/04384217245734242968</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='26' src='http://1.bp.blogspot.com/_AhA71miQEj0/SoqxZcE7mlI/AAAAAAAAAAM/SCAy8ISor6w/S220/zoinks.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4376947664270917518.post-2395339939426345711</id><published>2009-08-16T21:26:00.000+01:00</published><updated>2009-08-16T22:58:29.394+01:00</updated><title type='text'>Simple patterns for better arguments about programming languages</title><content type='html'>I somehow get myself into a lot of arguments about programming languages.  Perhaps it's because I talk about them a lot, and everybody has an opinion!&lt;br /&gt;&lt;br /&gt;Getting into arguments about programming languages generally means listening to a large number of value-based judgements until somebody gets bored and admits that language X really isn't as bad as he or she originally asserted, and yes, maybe language Y &lt;span style="font-style: italic;"&gt;does&lt;/span&gt; have some flaws.  This is largely counter-productive.  If we're going to argue about the best way to skin a cat*, let's argue about things we can &lt;span style="font-style: italic;"&gt;actually compare&lt;/span&gt;.  So I present in this post a number of more objective ways of talking about the merits of programming languages such that you might actually persuade somebody that your well-informed opinion is actually valid!&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:85%;"&gt;(* Please don't, that's cruel and horrible)&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;I often hear the word "powerful" used to describe the relevant merits of programming languages, as in "People use Ruby because it is more powerful".  This is often used in conjunction with lots of other words like "maintainable", "simple", "faster", "easier" etc etc.  These are all largely intangible qualities that could just as well be described by "I prefer language X".&lt;br /&gt;&lt;br /&gt;That said, maybe there is some value in comparing languages.  So wha&lt;span style="font-style: italic;"&gt;&lt;span style="font-style: italic;"&gt;&lt;/span&gt;&lt;/span&gt;t can and can't we measure?&lt;br /&gt;&lt;br /&gt;It's worth creating a distinction at this point between the programming language itself (i.e. syntax and semantics - the set of valid programs), and the eco-system that exists around the language, in terms of compilers, tools, libraries, user-bases etc.  Sometimes these overlap in a big way, especially in cases where there exists only a single implementation of a given language.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Comparison 1: Performance&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;We have some reasonably good metrics for "performance" of a given implementation for a certain application.  These are often pulled out when one party in a heated discussion about the merits of a programming language demands evidence to support claims of superiority or inferiority.  That said, as a metric for comparing &lt;span style="font-style: italic;"&gt;languages&lt;/span&gt;, they are pretty lousy.  Sure, it's interesting that you can write a compiler for some language X that outperforms a compiler for language Y for a certain application.  Big deal.  It &lt;span style="font-style: italic;"&gt;might&lt;/span&gt; say something about the ability for the program to express efficient solutions to problems, or it might just tell you that one compiler is better than another.&lt;br /&gt;&lt;br /&gt;Before I wave my hands and dismiss "performance" as a useful metric altogether, it &lt;span style="font-style: italic;"&gt;is&lt;/span&gt; true that some languages lend themselves to efficient implementations better than others.  The &lt;a href="http://shootout.alioth.debian.org/"&gt;Computer Language Shootout Game&lt;/a&gt; makes for interesting reading, even if it is not a completely scientific means of comparing whole languages.  Without beating up on anyone's favourite languages, languages that permit dynamic behaviour (usually requiring interpretation rather than compilation-proper) don't really lend themselves to static analysis and optimisation.  Sure, it happens, but it is added to the runtime cost of the application (because it is done, by necessity, at runtime).  I don't think anybody would really have a problem admitting that languages like PHP, Python, Perl and Ruby are unlikely to be considered in the same performance category as C/C++, ML, Fortran and a bunch of other languages that &lt;span style="font-style: italic;"&gt;do&lt;/span&gt; lend themselves to good compile-time optimisation and compilation to fast, native code (or something with comparable performance).&lt;br /&gt;&lt;br /&gt;So, take-home talking point number one:  &lt;span style="font-style: italic;"&gt;Language X lends itself to compile-time optimisation and efficient code-generation that is better than language Y&lt;/span&gt;'s.&lt;br /&gt;&lt;br /&gt;You need to be careful with this one though - just because people haven't written good optimising compilers for a language doesn't mean it's not possible or even difficult!  Sometimes it's just not a priority.  If you're going to use this talking point, you need to be able to substantiate it!  Think about things like:  presence (or absence) of very dynamic features, referential transparency, treatment of constants, and syntactic ambiguity resolvable only at runtime (&lt;a href="http://www.perlmonks.org/?node_id=663393"&gt;Perl, I'm looking at you&lt;/a&gt;).&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Comparison 2: Readability&lt;br /&gt;&lt;br /&gt;&lt;/span&gt;This is argument is abused regularly.  Arguments about readability often boil down to "Language X is different to language Y, which I am used to, therefore I find it harder to read!".  This is not a good argument.  You're basically just saying "I find it easier to use things that I am familiar with", which is uh, pretty obvious, really.  Some other person who uses X more than Y probably finds Y equally unreadable.&lt;br /&gt;&lt;br /&gt;There might be one tiny point for comparison here, although I would hesitate to use it unless you're very sure that you're correct.  It basically goes like this:&lt;br /&gt;&lt;br /&gt;How much information about the expected run-time behaviour of a program can I discern from the structure of a small piece of the program?&lt;br /&gt;&lt;br /&gt;To elaborate, what is the smallestt piece of the program that I can look at independent of the rest while still having at least some idea what it is meant to do?  Ca&lt;span style="font-style: italic;"&gt;&lt;/span&gt;n I look at just one line?  A few lines?  A page of code?  The entire program spread over multiple files?  This is probably equivalent to some kind of &lt;span style="font-style: italic;"&gt;modularity&lt;/span&gt; property, although I don't mean this in terms of objects or module systems.  I mean, "if something is true within this piece of code now, will it still be true no matter what else I add to the system?"&lt;br /&gt;&lt;br /&gt;I figure this corresponds to "the amount of things you need to keep in your head" in order to read code.  Smaller is probably better.&lt;br /&gt;&lt;br /&gt;Talking point two: &lt;span style="font-style: italic;"&gt;Language X requires the programmer to mentally store fewer things in order to read and understand the program than language Y.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;You'll need to be able to substantiate this if you're going to use it.  Think about things like scoping rules, syntactically significant names and &lt;span style="font-style: italic;"&gt;consistency&lt;/span&gt;.  Does this operator always mean the same thing?  No?  Does its meaning vary in some consistent way that is indicated by a (relatively small) context?  It doesn't?  Oh dear.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Comparison 3: Bugs!&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;&lt;span style="font-weight: bold;"&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;Don't get drawn into this one lightly.  Discussions about bugs can go on for days.  The sources of bugs, the causes of bugs, and what language design decisions influence the ability to detect (and indeed prevent) bugs.&lt;br /&gt;&lt;br /&gt;Being a formal methods geek, I'm pretty passionate about this point.  I think bugs in all their forms are bad things.  Some people see them as an unavoidable side-effect of producing software, much like mopping a floor will make the floor all wet.&lt;br /&gt;&lt;br /&gt;I think a well-designed language should allow a compiler to prevent you from making dumb mistakes.  I'm going to beat up on PHP here, because really, anybody who has used PHP pretty much loathes it.  My pet peeve:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;$someVeryComplicatedVariableName = doSomeStuff();&lt;br /&gt;&lt;br /&gt;print $someVeryComplicatedName;&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;Whoops!  I guess I didn't type that variable name correctly the second time.  Oh well, don't worry, PHP's default behaviour is to initialise it to an empty value that is coerced to an appropriate type for the context.  In this case, my program would print nothing at all.  Sure, it issues a &lt;span style="font-style: italic;"&gt;notice&lt;/span&gt; somewhere, but this isn't considered an error in PHP, because we don't declare things!  In this case, the design of the language prevents the compiler/intepreter from stopping us from doing something very dumb.&lt;br /&gt;&lt;br /&gt;Languages that lend themselves to compilers that enforce global consistency in some way are good at preventing errors.  I've already said in a previous post that I'm a fan of type systems that prevent dumb mistakes.  Obviously in a programming language there is always some conceptual gap between what we expect the compiler to do, and what it actually does.  To me a well-designed programming language is one that will prevent me from being inconsistent, and which will tell me if my understanding of a program differs significantly from the compiler's interpretation of it (manifested in the way I am writing my program, probably as a type error!).&lt;br /&gt;&lt;br /&gt;Finally, how many things are put in front of me that I have to get right in order for my program to be correct?  Oh, I have to do my own memory management?  So it would be bad if I got that wrong, right?&lt;br /&gt;&lt;br /&gt;Talking point:  &lt;span style="font-style: italic;"&gt;Language X is designed in such a way that a compiler or interpreter can detect and report a larger class of errors, and prevent more simple mistakes than one for language Y&lt;/span&gt;.&lt;br /&gt;&lt;br /&gt;Do your research on this one too.  Unfortunately there are very few studies with substantial findings on the types of errors that programmers make.  You'll need to restrict yourself to tangible classes of errors that are detectable in language X, but will not be detectable in language Y, by virtue of the design of the language.  Remember, a bad compiler might not report errors, but that doesn't necessarily mean it's impossible to detect them!&lt;br /&gt;&lt;br /&gt;The natural counter to argument is usually to propose an absurdly complicated way by which it might be possible to simulate the language-based features of language X in language Y at the application level.  For example, "We'll search-and-replace to replace every occurence of variable x with a function that validates its internal structure!"  At this point, you should probably just sigh and walk away, safe in the knowledge that you don't need to worry about any such absurd solutions in your own daily life.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Comparison 4: Eco-System Factors&lt;br /&gt;&lt;br /&gt;&lt;/span&gt;If you want to argue about libraries and user communities, be my guest.  If you're having an argument about programming languages, these things are probably important, but they really only tell us the way things are, not why they are like that or how they should be.  Arguing about how things should be is not particularly helpful, but similarly, "Everyone uses X so it must be better than Y" is a very unhelpful argument.  A lot of people get cancer, too.  Doesn't mean I'm lining up for it.&lt;span style="font-style: italic;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Talking point: &lt;span style="font-style: italic;"&gt;Language X has considerably more library support for &lt;application&gt; than language Y, and that is what I am interested in doing&lt;/span&gt;.&lt;br /&gt;&lt;br /&gt;Perfectly valid.  Just don't fall into the trap of suggesting that having more libraries means that one language is better than any other.  Most languages worth their salt actually provide some sort of foreign function interface, so the idea of a library being &lt;span style="font-style: italic;"&gt;completely&lt;/span&gt; unavailable in a language is pretty rare.  You might have to put a bit of work into it, but it's certainly not a limitation of the language just because someone wrote a library for X and not for Y.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Comparison 5: Non-comparisons&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;This is my "please avoid" list.  If you use any of these arguments in a discussion on programming languages, you should really look at what evidence you have, because you probably don't have any:&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Lines of Code - please, please avoid using this.  It's just not relevant.  If you use this, you have reduced an argument about the relative merits of programming languages to one about typing speed.&lt;/li&gt;&lt;li&gt;Elegance - I know what you mean, some solutions are elegant, but this really can't be quantified.  It's probably the &lt;span style="font-style: italic;"&gt;solution&lt;/span&gt; that is elegant, rather than the solution in a given language.  It could probably be expressed elegantly in almost any language by a programmer sufficiently familiar with the conventions of that language.&lt;/li&gt;&lt;li&gt;"&lt;span style="font-style: italic;"&gt;I don't need a language-based solution for that, we have social conventions&lt;/span&gt;".  No, no, no.  What you're saying here is that you are accepting that this problem is still a problem if people don't follow the conventions.  Guess what?  People are really bad at following rules.  Some people are just born rule-breakers.&lt;/li&gt;&lt;li&gt;"&lt;span style="font-style: italic;"&gt;Language X is too restrictive&lt;/span&gt;".  Really?  I find those pesky laws about not going on murderous rampages rather restrictive too.  If you can't qualify this statement with a very specific example of something that is both within the domain of the language,&lt;span style="font-style: italic;"&gt; and&lt;/span&gt; which genuinely can't be done using the features of the language in question, then you're just saying "&lt;span style="font-style: italic;"&gt;Language X doesn't let me do things exactly how I did them in language Y&lt;/span&gt;".&lt;/li&gt;&lt;/ul&gt;So I hope this all provides some food for thought.  I would love to hear of any other arguments that people resort to about programming languages that really annoy you.  Or perhaps I've missed some genuinely valuable comparisons that can be made?&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4376947664270917518-2395339939426345711?l=plsadventures.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://plsadventures.blogspot.com/feeds/2395339939426345711/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://plsadventures.blogspot.com/2009/08/simple-patterns-for-better-arguments.html#comment-form' title='8 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4376947664270917518/posts/default/2395339939426345711'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4376947664270917518/posts/default/2395339939426345711'/><link rel='alternate' type='text/html' href='http://plsadventures.blogspot.com/2009/08/simple-patterns-for-better-arguments.html' title='Simple patterns for better arguments about programming languages'/><author><name>Gian</name><uri>http://www.blogger.com/profile/04384217245734242968</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='26' src='http://1.bp.blogspot.com/_AhA71miQEj0/SoqxZcE7mlI/AAAAAAAAAAM/SCAy8ISor6w/S220/zoinks.jpg'/></author><thr:total>8</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4376947664270917518.post-2519494197483164110</id><published>2009-08-15T16:10:00.000+01:00</published><updated>2009-08-15T18:38:41.630+01:00</updated><title type='text'>Sources of inspiration and Interesting directions for programming languages</title><content type='html'>&lt;span style="font-family:verdana;"&gt;I try to keep abreast of new developments in programming languages, but it's pretty easy to fall behind.  There are so many small languages (I mean that in terms of user-base or general appeal) that may incorporate interesting features, but which are not all that appealing in of themselves.  I've decided to dedicate this post to something of a sketch of where I would like to take my own work in programming languages, and hopefully illustrating a few interesting sources of inspiration for anyone who has an interest in programming languages.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:verdana;"&gt; I've tried to provide some links to projects which are currently working towards these goals in different ways, although it is unlikely to be complete.  Let me know if you think I've missed any significant developments in any of these areas.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;font-family:verdana;font-size:130%;"  &gt;Exotic Type Systems&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:verdana;"&gt;I really like strong, static typing.  I'm a big fan of SML, for example, and the combination of type safety with good type inference means big benefits with very little effort.  I really don't want to get into the entire dynamic/static typing argument, but needless to say, I think that it's worth having your compiler do the work for you.  An inability to formulate programs in this way strikes me as a result of lack of imagination, rather than some inherent limitation.  Every time I hear somebody claim that they need the large-scale dynamic behaviour of language X in order to Y, I can't help think that I really don't need the bugs.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:verdana;"&gt;But strong, static type systems are old news now.  More and more languages are integrating them and buying into the dream.  But there is even more that a type system can give you!  One promising direction:  dependent types.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:verdana;"&gt;If you've used something like ML or Haskell before, dependent types are a reasonably intuitive extension.  Imagine the type signature for a function &lt;/span&gt;&lt;span style="font-style: italic;font-family:verdana;" &gt;append&lt;/span&gt;&lt;span style="font-family:verdana;"&gt; that takes two lists and joins them together:&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;val append : 'a list * 'a list -&amp;gt; 'a list&lt;/pre&gt;&lt;br /&gt;Brilliant.  We've cut out the possibility (at compile time) of anyone trying to pass lists of different types, or using the result of &lt;span style="font-style: italic;"&gt;append&lt;/span&gt; in such a way that they expect &lt;span style="font-style: italic;"&gt;append([1,2,3],[4,5,6])&lt;/span&gt; to return a list of strings.  It doesn't, however, tell us very much about the lists themselves, and that's where dependent types come in!&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;val append : 'a list&amp;lt;n&amp;gt; * 'a list&amp;lt;m&amp;gt; -&gt; 'a list&amp;lt;n+m&amp;gt;&lt;/pre&gt;&lt;br /&gt;Now we have added additional information about the length of the list.  Now a correct implementation of &lt;span style="font-style: italic;"&gt;append&lt;/span&gt; must implement this behaviour, and a programming cannot then attempt to use the result of  &lt;span style="font-style: italic;"&gt;append&lt;span style="font-style: italic;"&gt; &lt;/span&gt;&lt;/span&gt;in any way that is inconsistent with this behaviour.  Simiarly, we can start to check some interesting properties:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;val zip : 'a list&amp;lt;n&amp;gt; * 'b list&amp;lt;n&amp;gt; -&gt; ('a * 'b) list&amp;lt;n&amp;gt;&lt;/pre&gt;&lt;br /&gt;Which tells us (most importantly) that the two lists passed to the &lt;span style="font-style: italic;"&gt;zip&lt;/span&gt; function must be of the same length (&lt;span style="font-style: italic;"&gt;n&lt;/span&gt;).&lt;br /&gt;&lt;br /&gt;This is an absurdly simple treatment of dependent types, featuring only computation on integer values (in no particular syntax - this is just some ML-inspired syntax which one might use to express dependent types).  While projects like Dependent ML have restricted dependent type values to simple arithmetic on integers for tractability reasons, dependent types could extend to permitting any value (or expression yielding a value) in the language to be used as a value within a dependent type.  This kind of expressive power quickly leads to the ability to construct specifications of programs within the type system, which are then statically checked at compile time.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:lucida grande;"&gt;Links of interest:  &lt;/span&gt;&lt;a style="font-family: verdana;" href="http://coq.inria.fr/"&gt;Coq&lt;/a&gt;&lt;span style="font-family:lucida grande;"&gt;, &lt;/span&gt;&lt;a style="font-family: verdana;" href="http://coq.inria.fr/"&gt;ATS&lt;/a&gt;&lt;span style="font-family:lucida grande;"&gt;, &lt;/span&gt;&lt;a style="font-family: verdana;" href="http://www.e-pig.org/"&gt;Epigram&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;font-family:verdana;font-size:130%;"  &gt;Visual Programming Languages&lt;br /&gt;&lt;br /&gt;&lt;/span&gt;I know, people have been going on about visual programming languages for years.  There is just nothing that appealing about them as a concept.  I wouldn't use one.  But for some reason, I still think they are a good idea, which makes me think that the problems might be in the execution, rather than the theory.  For me at least, there is just some kind of instinctual cringe factor about them though.&lt;br /&gt;&lt;br /&gt;That said, I think they are very likely to provide a good way to manage complexity going forward.  I have no idea what such a system might look like, but my experience working with process algebras has always been that one labelled transition system diagram is a whole lot more useful to my understanding than a page full of notation.  As the types of systems we want to describe get more complex (particularly with reference to mobile and distributed systems), I think visual languages may be the best way forward.&lt;br /&gt;&lt;br /&gt;Bigraphs are interesting in this way, as the visual representation is (according to Milner's original paper) considered primary, with any textual representation being second to the visual representation.  Bigraphs deserve a post all of their own (and they will likely get one soon), but I think if we can figure out a way to describe programs and systems visually such that unnecessary complexity is hidden, they might start to catch on.&lt;br /&gt;&lt;br /&gt;Links of interest:  &lt;a href="http://en.wikipedia.org/wiki/Visual_programming_language"&gt;The Wikipedia Article&lt;/a&gt; (see how many there are!?)&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;font-family:verdana;font-size:130%;"  &gt;Concurrent Programming Languages&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;I go on about concurrency a lot, but there's a good reason for that.  It's interesting!  Languages with process algebras hidden inside hold a special place in my heart.  The Occam Programming Language is an example of brilliant elegance (as was the Transputer for which it was designed) well ahead of its time.  In Occam, things are basically concurrent by default.  The programmer must &lt;span style="font-style: italic;"&gt;deliberately&lt;/span&gt; write sequential code, rather than everything being sequential by default, by virtue of the way things are written down as is the case in most languages.  This definitely appeals to me.&lt;br /&gt;&lt;br /&gt;Also of note are languages that implement &lt;a href="http://www.ps.uni-sb.de/alice/manual/futures.html"&gt;Futures&lt;/a&gt; in a clean way.  I'm sure many languages do, however my only real exposure to futures has been through Alice ML.  Futures are (I think) a nice way to think about certain concurrent computations, because they can essentially just be thought of as a special kind of laziness.&lt;br /&gt;&lt;br /&gt;Links: &lt;a href="http://en.wikipedia.org/wiki/Occam_%28programming_language%29"&gt;Occam&lt;/a&gt;, &lt;a href="http://www.erlang.org/"&gt;Erlang&lt;/a&gt;, &lt;a href="http://www.ps.uni-sb.de/alice/"&gt;AliceML&lt;/a&gt;, &lt;a href="http://www.mozart-oz.org/"&gt;Mozart/Oz&lt;/a&gt;, &lt;a href="http://clojure.org/"&gt;Clojure&lt;/a&gt;*&lt;br /&gt;&lt;br /&gt;* Note:  I'm seriously allergic to Lisp and all its relatives.  I'm not going to go into the reasons because I'll get flamed, but needless to say, I wouldn't use it.&lt;br /&gt;&lt;br /&gt;As great as these languages are, I'd like to see things go a lot further in this direction.  &lt;a href="http://www.itu.dk/research/pls/wiki/index.php/Bigraphical_Programming_Languages_%28BPL%29"&gt;Bigraphical Programming Languages &lt;/a&gt;seem very promising, although it is still early days.  Given that it's possible to &lt;a href="http://linkinghub.elsevier.com/retrieve/pii/S1571066107003763"&gt;represent the lambda calculus within bigraphs&lt;/a&gt;, it seems like this could take us a long way towards closing the gap between the way we create software now, and the way we would like to create software in the future.  Given the number of things that have processors in them these days, the notion of software existing as a process on a single machine seems very limiting.  Cloud computing is a good start, but what about totally ubiquitous computing?  Advances in programming languages are keeping up to date with the way people create software, but I'm interested in how we'll create software 50 years from now.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;font-family:verdana;font-size:130%;"  &gt;More Theorem Proving&lt;br /&gt;&lt;br /&gt;&lt;/span&gt; Writing software inside a theorem prover is a reasonably rewarding experience.  I haven't done a heck of a lot of it, and it's definitely hard work, but the benefits are obvious.  Instead of making a mental note or leaving a small comment about checking if a certain property of your program is true, you write down a lemma and prove it.  I think this style of designing software is definitely beyond the reach of most people (it's mostly beyond my reach for anything above a reasonably small size), simply because of the effort involved.  Given that most people refuse to engage with type systems that tell them they are wrong, I can't imagine programmers queuing up to be told they are wrong more often.&lt;br /&gt;&lt;br /&gt;That said, I think there is definitely room for making greater use of theorem proving "behind the scenes" in programming languages and compilers.  I think with tool support, this could be done in such a way that made little impact on the way programmers work.  More static analysis can only be a good thing, especially if it gives us the means to provide early feedback to programmers about properties that may or may not be true.  Going one step further towards tools that would ask the programmer to clarify their intentions could be very helpful.  Hopefully it could be done in a less irritating way than a certain office-assistant paper clip.  In any case, I think there is a lot to be gained by shrinking the gap between "programming" and "theorem proving" as separate activities.&lt;br /&gt;&lt;br /&gt;Interesting links: &lt;a href="http://coq.inria.fr/"&gt;Coq&lt;/a&gt;, &lt;a href="http://www.cl.cam.ac.uk/research/hvg/Isabelle/"&gt;Isabelle&lt;/a&gt;, &lt;a href="http://groups.csail.mit.edu/pag/daikon/"&gt;Daikon&lt;/a&gt;, &lt;a href="http://kind.ucd.ie/products/opensource/ESCJava2/"&gt;ESC/Java2&lt;/a&gt;&lt;br /&gt;&lt;span style="font-weight: bold;font-family:verdana;font-size:130%;"  &gt;&lt;br /&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4376947664270917518-2519494197483164110?l=plsadventures.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://plsadventures.blogspot.com/feeds/2519494197483164110/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://plsadventures.blogspot.com/2009/08/sources-of-inspiration-and-interesting.html#comment-form' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4376947664270917518/posts/default/2519494197483164110'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4376947664270917518/posts/default/2519494197483164110'/><link rel='alternate' type='text/html' href='http://plsadventures.blogspot.com/2009/08/sources-of-inspiration-and-interesting.html' title='Sources of inspiration and Interesting directions for programming languages'/><author><name>Gian</name><uri>http://www.blogger.com/profile/04384217245734242968</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='26' src='http://1.bp.blogspot.com/_AhA71miQEj0/SoqxZcE7mlI/AAAAAAAAAAM/SCAy8ISor6w/S220/zoinks.jpg'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4376947664270917518.post-2056072223687490771</id><published>2009-08-11T11:40:00.000+01:00</published><updated>2009-08-11T12:00:10.290+01:00</updated><title type='text'>If concurrency is easy, you're either absurdly smart or you're kidding yourself</title><content type='html'>Concurrency is hard.  I like to think I'm a pretty smart guy.  I'm definitely a very good programmer, and I've had plenty of theoretical and practical experience with concurrency.  I still find concurrency very hard.&lt;br /&gt;&lt;br /&gt;Concurrency is viewed as black magic, even by programmers who use it on a daily basis.  When people tell me that concurrency is easy, it generally means one of two things - They are super-humanly smart, or the way they are using concurrency is restricted to nice simple patterns of interaction to which we can apply some simple mental analogy. &lt;a href="http://en.wikipedia.org/wiki/Thread_pool_pattern"&gt;Thread pools&lt;/a&gt; are fairly easy to think about, for example.&lt;br /&gt;&lt;br /&gt;I'm talking about the kind of eye-wateringly difficult concurrency that arises when you are trying to extract some sort of task-level parallelism, or when you are working with loosely-coupled distributed systems (read "the general case").  The difficulty of concurrency is usually mentioned in the same breath as a list of common defects (e.g. race conditions, deadlocks, livelocks, etc), however I think it is important to remember that these things are all just outcomes that arise as a result of the difficulty, in the same way that hitting the ground is the probable outcome of falling off a building.&lt;br /&gt;&lt;br /&gt;The possibility for a programmer to introduce problems such as deadlocks and race conditions is not what makes concurrency difficult.  It's our total inability to mentally reason about concurrency that makes it difficult.&lt;br /&gt;&lt;br /&gt;Most programmers above a certain level are very good at conceptualising large, complex systems.  They can interrogate perceived weaknesses in a program before it is even written.  This is how good programmers manage to write programs that are largely free of defects and that generally work in the way that is expected.  Small-scale errors aside, a good high-level conceptual understanding of the system, coupled with an ability to mentally simulate the behaviour of any portion of a program provides programmers with most of the tools they need to construct good-quality programs.  The programmer is likely to know (at least at some high level) how the entire system is going to work any any given moment.&lt;br /&gt;&lt;br /&gt;Large-scale concurrency robs us of this ability to easily mentally simulate behaviour.  It makes our mental models unimaginably complex.  When it comes to concurrency, most of the intuitions we use to guide us through development simply break down.  I think of it as a sort of night-blindness.&lt;br /&gt;&lt;br /&gt;So what is different?  Humans are very good at causality.  When presented with the present state of the universe, plus some action, we can generally predict with a high degree of accuracy what the state of the universe will be after that action has been performed.  We can integrate huge numbers of factors into our calculation of the future state of the universe - gravity, wind, light, sound, heat.  If we start rolling a ball down a slope, we can imagine at any point in time approximately where that ball is going to be.  What we don't expect, or even attempt to predict, is an inter-dimensional portal opening, an alternative-reality version of ourselves stepping out of the void and setting the ball on fire.  And that's kinda concurrency for you.  We are good at imagining ourselves in a situation.  We are the observer in the code.  Unfortunately in a concurrent situation, we are the observer in many different "realities".&lt;br /&gt;&lt;br /&gt;Even with the knowledge that there is some algorithmic nature to the opening of dimensional portals during our ball-rolling activities, and that the alternative-reality observer's pyromaniac tendencies can moderated by holding up a certain flag, our mental models of the universe just don't extend to predicting this kind of behaviour.  If I were inclined towards evolutionary psychology, I would probably suggest that it's because as a species, we don't encounter these kinds of situations in our natural habitats very often.  We can predict cause-and-effect, but the idea of having to maintain a model of a second universe with its own rules as well as our own is just beyond the needs of most early humans, who were probably much more concerned with procreating and trying to avoid being eaten by tigers.&lt;br /&gt;&lt;br /&gt;It's not just the breakdown of causality that is a problem - it's also syntax.  I know I claimed I wasn't going to talk about syntax just two short posts ago, but it's kinda important here.&lt;br /&gt;&lt;br /&gt;When we think about some function "f", we can conceptualise it as something like:&lt;br /&gt;&lt;br /&gt;(S,x) --&lt;some&gt;-(some computation)---&gt; (S',y)&lt;br /&gt;&lt;br /&gt;Where "S" on the right represents the state of the world before the computation occurs, "x" is some input, "S'" is the state of the world after the comptuation has occurred and "y" is some output.  Given that most programming languages give us the ability to control scope, "S" is probably pretty small.  We usually don't even bother writing it down, leading to the more comfortable:&lt;br /&gt;&lt;br /&gt;f(x) = (some computation)&lt;some&gt;&lt;br /&gt;&lt;br /&gt;With an implied world-state being accessible to us before and after the computation.  Causality applies and all is well with the world.  We can mentally substitute any occurrence of "f(x)" with the behaviour of the computation.&lt;br /&gt;&lt;br /&gt;Now, consider something concurrent.  A given transition in a labeled transition system has one input state, one output state, plus a side-effect:&lt;br /&gt;&lt;br /&gt;x ---a---&gt; y&lt;br /&gt;&lt;br /&gt;Assuming some interpretation of this LTS using name-based synchronisation, any other occurence of "a" in the system is going to be causally related to this occurence of "a".  We're used to names just being names - they have no special meaning.  Well now "a" &lt;span style="font-style: italic;"&gt;does&lt;/span&gt; have a special meaning.  This is where it all starts to get a bit dicey.  We no longer have all of the information we need in front of us.  Causality has gone from being able to consider events locally within some frame of reference to needing to maintain a globally consistent mental model of all the ways that an "a" action could occur anywhere in the universe.  Reading this description of behaviour in isolation is no longer sufficient to tell us the whole story.&lt;br /&gt;&lt;br /&gt;This isn't just some problem with name-based synchronisation.  If anything, I think name-based point-to-point communication is considerably simpler than most of what we do in practice, and yet we are already at a point where our ability to rely on local information to give us clues as to how this piece of the system will behave has been taken away.&lt;br /&gt;&lt;br /&gt;Introducing locking gets a bit scarier yet again.  We're now required to know the state of our world at any given point in time, as well as needing to know the state of every other given world at any given point in time.  And the notions of time in each world are not the same.&lt;br /&gt;&lt;br /&gt;I think it's a wonder that we get concurrency right as often as we do.  I think it's as much by careful attention to detail and dumb luck as by good planning or being good at this type of thing.&lt;br /&gt;&lt;br /&gt;Now, before you jump up and down and tell me that "insert-your-favourite-concurrency-model-here addresses these problems", I know there are strategies for managing the complexity of these tasks.  I did concurrency theory for a living, so I know some approaches are better than others.  My point here was essentially to illustrate why it is important to manage complexity when doing concurrency.  It's definitely not just a case of "you can't do concurrency so that means you are stupid".  I'm arguing that pretty much everyone is bad at concurrency beyond some simple patterns of interaction that we cling to because we can maintain a good mental model by using them.&lt;br /&gt;&lt;br /&gt;In my next exciting installment, I'll talk about how I think we can improve the state of the art.  Spoiler alert: it involves &lt;a href="http://www.itu.dk/research/pls/wiki/index.php/A_Brief_Introduction_To_Bigraphs"&gt;Bigraphs&lt;/a&gt;, and some ideas that I hope I might be able to turn into a viable commercial proposition one of these days.&lt;br /&gt;&lt;br /&gt;Oh,  also, I recall reading a really good paper about the problems with syntax for representing terms in process algebras a few years ago, and I can't for the life of me remember what it was.  If anyone has any ideas what it might have been, let me know.  I'll give you a cookie.&lt;/some&gt;&lt;/some&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4376947664270917518-2056072223687490771?l=plsadventures.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://plsadventures.blogspot.com/feeds/2056072223687490771/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://plsadventures.blogspot.com/2009/08/if-concurrency-is-easy-youre-either.html#comment-form' title='14 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4376947664270917518/posts/default/2056072223687490771'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4376947664270917518/posts/default/2056072223687490771'/><link rel='alternate' type='text/html' href='http://plsadventures.blogspot.com/2009/08/if-concurrency-is-easy-youre-either.html' title='If concurrency is easy, you&apos;re either absurdly smart or you&apos;re kidding yourself'/><author><name>Gian</name><uri>http://www.blogger.com/profile/04384217245734242968</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='26' src='http://1.bp.blogspot.com/_AhA71miQEj0/SoqxZcE7mlI/AAAAAAAAAAM/SCAy8ISor6w/S220/zoinks.jpg'/></author><thr:total>14</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4376947664270917518.post-589257116388359861</id><published>2009-08-09T16:13:00.000+01:00</published><updated>2009-08-09T17:10:23.703+01:00</updated><title type='text'>When are we going to stop writing code?</title><content type='html'>Programmers, generally speaking, like writing code.  It seems obvious, but it's important to the point I would like make.&lt;br /&gt;&lt;br /&gt;Software defects arise from writing code.  Sure, there are classes of errors which arise as a result of programmers or stakeholders actually just getting requirements or specifications wrong, but mistakes in understanding or requirements only manifest once they are translated (by flawed, fallible humans) into something executable.&lt;br /&gt;&lt;br /&gt;So a very simple way to greatly reduce the number of defects that exist in our software seems to be to stop humans writing code.  By pushing more of the work into compilers and tools (which we can verify with a high degree of confidence), we reduce the areas where human error can lead to software defects.&lt;br /&gt;&lt;br /&gt;We're already on this path, essentially.  Very few people write very low-level code by hand these days.  We rely on compilers to generate executable code for us, which  allows us to work at a higher level of abstraction where we are more likely to be able to analyse and discover mistakes without needing to run the program.&lt;br /&gt;&lt;br /&gt;Similarly, type systems integrated with compilers and static analysis tools remove the burden on us as programmers to manually verify certain runtime properties of our systems.  Garbage collectors remove humans from the memory-allocation game altogether.&lt;br /&gt;&lt;br /&gt;See what I'm getting at?   We have progressively removed bits of software development from the reach of application developers.  Similarly, the use of extensive standard libraries packaged with mainstream programming languages (hopefully!) no longer requires programmers to create bespoke implementations of often-used features.  The less code a programmer writes, the fewer chances he or she has to introduce errors (errors in library implementations are a separate issue - however a finite amount of code re-used by many people is likely to be much better over time than a piece of code used by one implementation).&lt;br /&gt;&lt;br /&gt;The rise of various MVC-style frameworks that generate a lot of boilerplate code (e.g. Ruby on Rails, CakePHP, etc.) further shrinks the sphere of influence of the application developer.  In an ideal world, we  would be able to use all of these sorts of features to ensure that we essentially just write down the interesting bits of our application functionality, and the surrounding tools ensure global consistency is maintained.  As long as we can have a high degree of confidence in our tools, we should be producing very few errors.&lt;br /&gt;&lt;br /&gt;There is one basic problem: it doesn't go far enough.&lt;br /&gt;&lt;br /&gt;Despite their best intentions, Ruby on Rails and CakePHP are basically abominations.  I speak only of these two in particular because I've had direct experience with them.  Perhaps other such frameworks are not awful.  The flaws in both frameworks can essentially be blamed on their implementation languages, and the paradigm that governs their implementations.  Without any kind of type safety, and with very little to help the programmer avoid making silly mistakes (e.g. mis-spelling a variable name), we can't really have a high degree of confidence in these tools.&lt;br /&gt;&lt;br /&gt;Compilers, on the other hand, are generally very good.  I have a high degree of confidence in most of the compilers I use.  Sure, there are occasional bugs, but as long as you're not doing safety-critical development, most compilers are perfectly acceptable.&lt;br /&gt;&lt;br /&gt;So why are there still defects in software?  First, most new developments still use old tools and technologies.  If any kind of meritocracy was in operation, I would guess that very few new things other than OS kernels and time-critical embedded systems would be written in C, but that's simply not the case.  Many things that make us much better programmers (by preventing us from meddling in parts of the development!) are regarded as "too hard" for the average programmer.  Why learn how to use the pre-existing implementation that has been tested and refined over many years when you can just roll-your-own or keep doing what you've always done?  Nobody likes to feel out of their depth, and clinging tight to old ideas is one way to prevent this.&lt;br /&gt;&lt;br /&gt;Having done quite a bit of programming using technologies that are "too hard" (e.g. I'm a big fan of functional programming languages such as ML and Haskell), I think that if you use these technologies as they are designed to be used, you &lt;span style="font-weight: bold;"&gt;can&lt;/span&gt; dramatically reduce the number of defects in your software.  I know I criticised methodology "experts" in my previous post for using anecdotal evidence to support claims, but this isn't entirely anecdotal.  A language with a mathematical guarantee of type safety removes even the possibility of deliberately constructing programs that exhibit certain classes of errors.  They simply cannot happen, and we can have a high degree of confidence in their impossibility.  As programmers, we do not even need to consider contingencies or error handling for these cases, because the compiler will simply not allow them to occur.  This is a huge step in the right direction.  We just need more people to start using these sorts of approaches.&lt;br /&gt;&lt;br /&gt;So, the title of this post was &lt;span style="font-style: italic;"&gt;"When are we going to stop writing code?&lt;/span&gt;", and I ask this with some seriousness.  As we shrink the range of things that programmers are responsible for in software development, we shrink the set of software defects that they can cause.  Let's keep going!  I believe it is very nearly within our reach to stop writing software and start &lt;span style="font-weight: bold;"&gt;specifying&lt;/span&gt; software instead.  Write the specification, press a button and have a full implementation that is mathematically guaranteed to implement your specification.  Sure, there may be bugs in the specification, but we already have some good strategies for finding bugs in code.  With a quick development cycle, we could refine a specification through testing and through static analysis.  We can build tools for specifications that ensure internal consistency.  And as in the other situations where we have been able to provide humans with more abstract ways to represent their intentions, it becomes much easier for a human to verify the correctness of the representation with respect to their original intentions, without the need to run a "mental compiler" between code and the expected behaviour.  This means we can leave people to solve problems and let machines write code.&lt;br /&gt;&lt;br /&gt;That said, it's probably still not realistic for people to stop writing code tomorrow.  The tools that exist today are far from perfect.  We're still going to be forced to write code for the forseeable future.  We can get pretty close to the Utopian ideal simply by using the best tools available to us here and now, and in the mean time, I'm going to keep working on writing less code.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4376947664270917518-589257116388359861?l=plsadventures.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://plsadventures.blogspot.com/feeds/589257116388359861/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://plsadventures.blogspot.com/2009/08/when-are-we-going-to-stop-writing-code.html#comment-form' title='9 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4376947664270917518/posts/default/589257116388359861'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4376947664270917518/posts/default/589257116388359861'/><link rel='alternate' type='text/html' href='http://plsadventures.blogspot.com/2009/08/when-are-we-going-to-stop-writing-code.html' title='When are we going to stop writing code?'/><author><name>Gian</name><uri>http://www.blogger.com/profile/04384217245734242968</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='26' src='http://1.bp.blogspot.com/_AhA71miQEj0/SoqxZcE7mlI/AAAAAAAAAAM/SCAy8ISor6w/S220/zoinks.jpg'/></author><thr:total>9</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4376947664270917518.post-7832240011026780687</id><published>2009-08-09T16:05:00.000+01:00</published><updated>2009-08-09T16:10:33.717+01:00</updated><title type='text'>Welcome</title><content type='html'>Welcome.&lt;br /&gt;&lt;br /&gt;I've decided to start collecting some of my thoughts and musings on programming languages and semantics in this blog.  This is partially a reaction against repeated appearances of the writings of &lt;a href="http://www.rebelscience.org/"&gt;Louis Savain&lt;/a&gt; on various blogs and news sites that I read.  I have better things to do with my time than answer any of his over-reaching claims about programming in detail, but I disagree with most of them.  The primary reason for concentrating on semantics is that I find too many discussions of programming and software development quickly veer into discussions about syntax.  I find discussions about the merits of parentheses or value-judgements about "readability" really dull.  A supervisor of mine once memorably said:&lt;br /&gt;&lt;br /&gt;"Discussions about syntax are much like reading Shakespeare and then remarking on the excellent quality of the spelling".&lt;br /&gt;&lt;br /&gt;And how true that is.  Similarly, almost all of the (interesting) things that good programmers do relate to constructing abstractions, maintaining a consistent mental model of a complex system, and having a firm understanding of semantics, not of syntax.  To return to the previous analogy, knowing lots about the syntax of a particular programming language is about as useful in the construction of quality programs as knowing how to spell is use in the construction of quality plays.&lt;br /&gt;&lt;br /&gt;I use "quality" here without expanding on the definition.  I think most people could come up with a sufficient definition of "good quality software", which might include things like not crashing, not creating security vulnerabilities, and generally behaving in a sensible manner when faced with both valid and invalid inputs.  Software Engineering Tips has recently published an interesting article on &lt;a href="http://sites.google.com/site/yacoset/Home/signs-that-you-re-a-bad-programmer"&gt;what makes a bad programmer&lt;/a&gt;, and I find myself generally agreeing with most of the sentiments expressed.&lt;br /&gt;&lt;br /&gt;So, I hope to maintain a strong focus on formal methods, programming languages, concurrency and theoretical computer science, hopefully with only passing references to syntax or the particulars of implementations.  Similarly, I will likely stay away from the "human factors" involved in software development (e.g. methodology).  This is not because I think methodology is not important - quite the opposite is true.  I've just found that this is ground that has been endlessly covered in some detail by many authors, generally without a lot of hard evidence.  Such discussions generally devolve very quickly into the presentation of various anecdotes around vaguely-defined concepts of "productivity" and "ease of use", followed by some dubious conclusions that don't really follow from the scant evidence.  I don't claim that I will be objective or comprehensive.  Excellent sites such as &lt;a href="http://www.lambda-the-ultimate.org"&gt;Lambda The Ultimate&lt;/a&gt; already cover a broad range of topics in programming languages, logic and semantics with much more objective and substantiated viewpoints than I could ever hope to provide.  Instead, I will express opinions and a personal viewpoint on developments and topics within the area.&lt;br /&gt;&lt;br /&gt;Finally, a little bit about me.  I am a programmer by trade.  I have worked both on the sorts of everyday applications that are the bread-and-butter of the software industry, as well as research projects involving compilers and software quality.  I hope you enjoy my musings, and don't hesitate to comment if you disagree!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4376947664270917518-7832240011026780687?l=plsadventures.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://plsadventures.blogspot.com/feeds/7832240011026780687/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://plsadventures.blogspot.com/2009/08/welcome.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4376947664270917518/posts/default/7832240011026780687'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4376947664270917518/posts/default/7832240011026780687'/><link rel='alternate' type='text/html' href='http://plsadventures.blogspot.com/2009/08/welcome.html' title='Welcome'/><author><name>Gian</name><uri>http://www.blogger.com/profile/04384217245734242968</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='26' src='http://1.bp.blogspot.com/_AhA71miQEj0/SoqxZcE7mlI/AAAAAAAAAAM/SCAy8ISor6w/S220/zoinks.jpg'/></author><thr:total>0</thr:total></entry></feed>
