Saturday, February 02, 2008

What are programming languages for?

Somewhat to my dismay, a usenet post that I wrote six years ago has suddenly been getting a lot more attention than it deserves. But since it seems to have generated so much interest I decided to write a followup to clarify some of the confusion that the original post generated.

I was struggling with how to organize that followup when I serendipitously saw that Paul Graham had posted a refutation of my criticisms of Arc. I must say that, given the volume of discussion about Arc, the fact that Paul chose my particular criticism as worthy of refutation really made my day. And reading Paul's posting, as so often happens, helped some of my own ideas gel.

The title of this post is taken from a 1989 paper by Phil Agre and David Chapman entitled What are Plans For? Unfortunately, that paper seems to have fallen into obscurity, but it had an enormous influence on me at the time. The paper questioned the then-canonical view that plans are essentially programs to be executed more or less open-loop by a not-very-interesting (from an academic point of view) execution engine. It proposed an alternate point of view that plans should be considered as generalized information resources for a complex (and therefore academically interesting) execution engine. That led to a number of researchers (myself included) more or less contemporaneously putting this idea into practice, which was widely regarded at the time as considerable progress.

The point of this story is that sometimes the best way to make progress is not to just dig in and work, but to step back and question fundamental assumptions. In this case, I think it's worthwhile asking the question: what are programming languages for? Because there are a lot of tacit (and some explicit) answers to that question that I think are actually constraining progress.

The question is not often asked because the answer at first blush seems to be obvious: programming languages are for writing programs. (Duh!) And so the obvious metric for evaluating the quality of a programming language is: how easy does it make the job of writing programs? On this view, Paul's foundational assertion seems entirely reasonable:

I used what might seem a rather mundane test: I worked on things that would make programs shorter. Why would I do that? Because making programs short is what high level languages are for. It may not be 100% accurate to say the power of a programming language is in inverse proportion to the length of programs written in it, but it's damned close.

If you accept this premise, then Paul's approach of building a language by starting with Lisp and essentially Huffman-coding it makes perfect sense. If shorter-is-better, then getting rid of the odd extraneous paren can be a big win.

The reason Paul and I disagree is not that I question his reasoning, but that I question his premise. Shorter is certainly better all else being equal, but all else is not equal. I submit that there are (or at least ought to be) far more important considerations than brevity for a programming language, especially a programming language designed, as Arc is, for exploratory programming.

One indication that Paul's premise is wrong is to push it to its logical conclusion. For concise programs it's really hard to beat APL. But the drawback to APL is so immediately evident that the mere mention of the language is usually enough to refute the extreme version of the short-is-better argument: APL programs are completely inscrutable, and hence unmaintainable. And so conciseness has to be balanced at least to some extent with legibility. Paul subscribes to Abelson and Sussman's admonition that "Programs should be written for people to read, and only incidentally for machines to execute." In fact, Paul believes this so strongly that he thinks that the code should serve as the program's specification. (Can't find the citation at the moment.)

So there's this very delicate balance to be struck between brevity and legibility, and no possible principle for how to strike it because these are incommensurate quantities. Is it really better to shrink SETF down to one character (=) and DEFINE and DEFMACRO down to 3 each (DEF and MAC) than the other way around? For that matter, why have DEF at all? Why not just use = to define everything?

There's an even more fundamental problem here, and that is that legibility is a function of a person's knowledge state. Most people find APL code inscrutable, but not APL programmers. Text from *any* language, programming or otherwise, is inscrutable until you know the language. So even if you could somehow come up with a way to measure the benefits in legibility against the costs of making a program longer, where the local maximum was would depend on who was doing the reading. Not only that, but it would change over time as the reader got more proficient. (Or less. There was a time in my life when I knew how to solve partial differential equations, but I look back at my old homework and it looks like gobbledygook. And yet it's in my handwriting. Gives you some idea of how old I am. We actually wrote things with our hands when I was in school.)

There's another problem with the shorter-is-better premise, which is that the brevity of a program is much more dependent on the available libraries than on the structure of the language. If what you want to do is part of an available library then the code you have to write can be very short indeed, even if you're writing in Cobol (which is notoriously wordy). Contrariwise, a web server in APL would probably be an awful lot of work, notwithstanding that the language is the very caricature of concision.

I submit that what you want from a programming language is not one that makes programs shorter, but one that makes programs easier to create. Note that I did not say easier to write, because writing is only one part of creating a program. In fact, it is far from clear that writing is invariably the best way to create a program. (In fact, it is not entirely clear that the whole concept of program is even a useful one but that is a topic for another day.) The other day I built a program that does some fairly sophisticated image processing, and I did it without writing even a single line of code. I did it using Quartz Composer, and if you haven't ever tried it you really should. It is quite the eye-opening experience. In ten minutes I was able to build a program that would have taken me weeks or months (possibly years) to do any other way.

Now, I am not saying that Quartz Composer is the Right Thing. I am actually not much of a fan of visual programming languages. (In fact, I am in certain circles a notorious critic of UML, which I consider one of the biggest steps backward in the history of software engineering.) I only want to suggest that the Right Thing for creating programs, whatever it turns out to be, may involve an interaction of some form other than typing text. But if you adopt shorter-is-better as your premise you completely close the door on even considering that as a possibility, because your metric is only applicable to text.

There is another fundamental reason for questioning shorter-is-better, especially for exploratory programming. Exploratory programming by definition is programming where you expect to have to change things after you have written them. Doesn't it make sense then to take that into account when choosing a quality metric for a language designed to support exploratory programming? And yet, Paul writes:

The real test of Arc—and any other general-purpose high level language—is not whether it contains feature x or solves problem y, but how long programs are in it.

Built in to this is the tacit assumption that a shorter program is inherently easier to change, I suppose because there's simply less typing involved. But this is clearly not true. Haskell is also a very concise language, but making changes to Haskell code is notoriously difficult. (For that matter, writing Haskell code to begin with is notoriously difficult.)

Cataloging all the language features that potentially make change easier would take me far afield here. My purpose here is just to point out that the source of the disagreement between me and Paul is simply the premise that shorter-is-better. Paul accepts that premise. I don't.

So what are programming languages for? They are (or IMO should be) for making the creation of programs easier. Sometimes that means making them shorter so you can do less typing, but I submit that that is a very superficial criterion, and not one that is likely by itself to serve you well in the long run. Sometimes investing a little more typing can pay dividends down the road, like making you do less typing when you change your mind and decide to use a hash table instead of an association list.
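To make the hash-table-versus-association-list point concrete, here is a minimal sketch in Python. The class and function names (AssocListStore, HashStore, word_counts) are hypothetical and chosen only for illustration. The idea is that a few extra lines spent on a small common interface mean the calling code does not change when the storage does.

```python
# Hypothetical illustration: none of these names come from the original post.

class AssocListStore:
    """Lookup table backed by a list of (key, value) pairs."""

    def __init__(self):
        self._pairs = []

    def put(self, key, value):
        # Overwrite an existing pair if the key is already present.
        for i, (k, _) in enumerate(self._pairs):
            if k == key:
                self._pairs[i] = (key, value)
                return
        self._pairs.append((key, value))

    def get(self, key, default=None):
        for k, v in self._pairs:
            if k == key:
                return v
        return default


class HashStore:
    """Same interface, backed by a hash table (a Python dict)."""

    def __init__(self):
        self._table = {}

    def put(self, key, value):
        self._table[key] = value

    def get(self, key, default=None):
        return self._table.get(key, default)


def word_counts(words, store):
    """Count words using whichever store the caller passes in."""
    for w in words:
        store.put(w, store.get(w, 0) + 1)
    return store
```

Switching representations is then a change at the call site only: word_counts(words, AssocListStore()) becomes word_counts(words, HashStore()), and the counting code itself is untouched.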

One thing that many people found unsatisfying about my how-I-lost-my-faith posting is that I never really got around to explaining why I lost my faith other than saying that I saw people being productive in other languages. Sorry to disappoint, but that was basically it. What I think needs clarification is exactly what faith I lost. I did not lose faith in Lisp in the sense that it stopped being my favorite programming language. It didn't (notwithstanding that I switched to Python for certain things -- more on that in a moment). What I lost faith in was that Lisp was the best programming language for everyone (and everything), and that the only reason that people didn't use Lisp is that they were basically ignorant. My faith was that once people discovered Lisp then they would flock to it. Some people (hi Kenny!) still believe that. I don't.

The reason I switched to Python was that, for me, given the totality of the circumstances at the time, it was (and still is, though that may be changing) easier for me to build web sites in Python than it was in Lisp. And one of the big reasons for that had nothing to do with the language per se. It had to do with this. 90% of the time when I need to do something in Python all I have to do is go to that page and in two minutes I can find that someone has already done it for me.

Now, Lispers invariably counter that Lisp has all these libraries too, and they may be right. But the overall experience of trying to access library functionality in Python versus Lisp is night and day because of Python's "batteries included" philosophy. To access library functionality in Lisp I first have to find it, which is no small task. Then I often have to choose between several competing implementations. Then I have to download it, install it, find out that it's dependent on half a dozen other libraries and find and install those, then figure out why it doesn't work with my particular implementation... it's a freakin' nightmare. With Python I just type "import ..." and it Just Works. And yes, I know that Python can do this only because it's a single-implementation language, but that's beside the point. As a user, I don't care why Python can do something, I just care that it can.
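As a rough illustration of what "import and it Just Works" looks like in practice, the sketch below uses three modules that ship with a current CPython 3 install (json, hashlib, urllib.parse). The particular modules and the summarize function are assumptions picked for the example, not the ones referred to above.

```python
import hashlib
import json
from urllib.parse import urlencode


def summarize(record):
    """Serialize a dict three ways using only standard-library modules."""
    # JSON text with keys sorted so the output is stable.
    payload = json.dumps(record, sort_keys=True)
    # Hex SHA-256 digest of that JSON text.
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    # URL query string built from the same dict.
    query = urlencode(record)
    return payload, digest, query


payload, digest, query = summarize({"user": "ron", "lang": "python"})
```

Nothing here needs to be downloaded or installed separately, which is the whole point of the comparison.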

(BTW, having adopted Python, I find that the language itself actually has an awful lot to recommend it, and that there is a lot that the Lisp world could learn from Python. But that's a topic for another day.)

Let me close by reiterating that I have the highest respect for Paul. I admire what he's doing and I wish him much success (and I really mean that -- I'm not just saying it because I'm angling for an invitation to a YC lunch). But I really do think that he's squandering a tremendous opportunity to make the world a better place by basing his work on a false premise.

UPDATE: A correction and a clarification:

A lot of people have commented that making changes to Haskell code is not hard. I concede the point. I was writing in a hurry, and I should have chosen a better example (Perl regexps perhaps).

Others have pointed out that Paul's program-length metric is node count, not character count, and so APL is not a fair comparison. I have two responses to that. First, APL code is quite short even in terms of node count. Second, Paul may *say* he's only interested in node count, but the names he's chosen for things so far indicate that he's interested in parsimony at the character level as well (otherwise why not e.g. spell out the word "optional" instead of simply using the letter "o"?)

In any case, even node count is a red herring because it begs the question of where you draw the line between "language" and "library" and "program" (and, for that matter, what you consider a node). I can trivially win the Arc challenge by defining a new language (let's call it RG) which is written in Arc in the same way that Arc is written in Scheme. RG consists entirely of one macro: (mac arc-challenge-in-one-node () '([insert code for Arc challenge here])) Now the RG code for the Arc challenge consists of one node, so RG wins over Arc.

And to those who howl in protest that that is cheating I say: yes, that is precisely my point. RG is to Arc exactly what Arc is to Scheme. There's a lot of stuff behind the scenes that allows the Arc challenge code in Arc to be as short as it is, and (and this is the important point) it's all specific to the particular kind of task that the Arc challenge is. Here's a different kind of challenge to illustrate the point: write a program that takes a stream of images from a video camera, does edge-detection on those images at frame rates, and displays the results. Using Quartz Composer I was able to do that in about ten minutes with zero lines of code. By Paul's metric, that makes Quartz Composer infinitely more powerful than Arc (or any other programming language for that matter).

So the Arc challenge proves nothing, except that Arc has a wizzy library for writing certain kinds of web applications. But it's that *library* that's cool, not the language that it's written in. A proper Arc challenge would be to reproduce that library in, say, Common Lisp or Python, and compare how much effort that took.
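The one-node trick described above can be sketched in Python, using the standard ast module as a stand-in for counting nodes in Lisp code. This is only an analogy: Python's syntax tree is not what Paul counts, and the library name rg and its challenge function are hypothetical. The source is parsed but never executed, so rg does not need to exist.

```python
import ast


def node_count(source: str) -> int:
    """Count every node in the Python syntax tree of `source`."""
    return sum(1 for _ in ast.walk(ast.parse(source)))


# The "program" written out in full.
FULL_PROGRAM = """
def greet(name):
    return "hello, " + name

print(greet("world"))
"""

# The same behavior after hiding everything inside a hypothetical
# library called rg, so the visible program is a single call.
ONE_CALL_PROGRAM = "rg.challenge()"

full = node_count(FULL_PROGRAM)
one_call = node_count(ONE_CALL_PROGRAM)
```

The second count comes out smaller, but only because the work has been moved across the line between "program" and "library", not because anything got simpler.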


Don Geddis said...

Your link to the Python modules is only local to you (file:///), but doesn't work for your readers.

Also, I'd be interested in what you think Lisp can learn from Python on the narrow topic of language design (as opposed to libraries, or social organization, or programming environment, etc.).

Don Stewart said...

You write,

"making changes to Haskell code is notoriously difficult."

I think this is false ("notoriously" eh?)

While Haskell may be unfamiliar to many developers, the language explicitly makes refactoring and changes easier than in most other languages.

Purity and the type system prevent you from having your changes go wrong. In particular, the type system spots where code breaks as you change it, keeping you on track.

So, if you are a competent Haskell developer, you'll find modifying existing code easier than in other languages, simply due to the constraints of purity and type safety.

Sam Hughes said...

What Don Stewart said is not just logical reasoning. I have been able to go into Haskell projects as a complete neophyte (to the project, and to Haskell, somewhat) and make working changes that affect the high-level ways in which the program works, touching code across multiple files, even while heavily sleep-deprived, and have them be correct. (Thanks to the sleep-deprivation, the ugly code and huge patch wasn't pretty one time, but it worked.) Making correct changes that affect operation in interesting ways seems to be much easier in Haskell than other languages.

root said...

> Is it really better to shrink SETF down to one character (=) and DEFINE and DEFMACRO down to 3 each (DEF and MAC) than the other way around?

Better? Not really, they are just different names. Do they both make sense? Definitely. Does it matter in regards to language design? No. (Odd that DEFINE is Scheme and the rest are all Common Lisp though. I assume you meant DEFUN.)

> For that matter, why have DEF at all? Why not just use = to define everything?

Cleanliness. Paul wants to build a lisp, he wants the program to be legible. Though names are often shortened, Arc is not like APL.

> It had to do with this. 90% of the time when I need to do something in Python all I have to do is go to that page and in two minutes I can find that someone has already done it for me.

This is an unfair criticism of Arc, it's still not a real language yet.

I think Cliki is meant to be the module repository of common-lisp. Many libraries can be found from there.

> With Python I just type "import ..." and it Just Works.

Not exactly, sometimes you have to download, uncompress, and easy_install it. But yes, the general point is there. Arc has the potential of doing this later on as well, as it is controlled by PG et al. The important thing is getting a module system out at some point.

> I find that the language itself actually has an awful lot to recommend it, and that there is a lot that the Lisp world could learn from Python.

I look forward to hearing this.

Compounded Thought said...

I think you and Paul actually agree at the most basic level. I think Paul also is mostly concerned with making programs easier to create. The problem is that there's no good way to quantify how languages perform under that metric. So I think Paul has chosen the closest obvious approximation.

If this is indeed the case, then your criticism really degenerates to being unsatisfied with how close his chosen approximation comes to modelling the real criteria. In this discussion, it's not really useful to push the metric to its logical conclusion--since metaphors in general don't stand up to this kind of thing.

Wilkes Joiner said...

Shrinking defmacro to mac has nothing to do with brevity by Paul's definition. He consistently refers to node count.

Ron said...

> Your link to the Python modules is only local to you (file:///), but doesn't work for your readers.

Doh! Sorry about that. Fixed.

Jonathan Ellis said...

Python hasn't been a single-implementation language for a long time now. import [stdlib module] works just fine in IronPython and Jython, too, even though they implement in C# and Java what CPython implements in C.

Perhaps the relevant word here is that Python did start out as an _implementation_, though, rather than a design-by-committee spec.

Of course way back before I was born Lisp started out as an implementation, too. So maybe noting that isn't really useful, except as a cautionary note for the Python in another 10 years or so. :)

Ron said...

> I think you and Paul actually agree at the most basic level.

Well, we do agree on a lot of things, most notably that Lisp is cool. But we have some real disagreements too. For example, I think CLOS is cool, and he doesn't. I think that name-capture in a Lisp-1 is a real problem and he doesn't. And while I do believe that, all else being equal, shorter is better than longer, I do not think that brevity ought to be the be-all-and-end-all of programming language design.

Ron said...

[Ron:] It had to do with this. 90% of the time when I need to do something in Python all I have to do is go to that page and in two minutes I can find that someone has already done it for me.

[Root:] This is an unfair criticism of Arc, it's still not a real language yet.

You took that comment out of context. I was not talking about Arc there, I was referring to switching from Common Lisp to Python for Web programming. I know Arc is a work-in-progress. In fact, I've explicitly said that Arc's web-programming infrastructure, embryonic though it may be, is one of the things I like about Arc.

Ron said...

> Python hasn't been a single-implementation language for a long time now.

Good point.

James said...

My feeling is that there's a qualitative difference between your notion of brevity and Paul's tho'. I think Paul hopes to achieve concision by providing a way to code in a language which requires little for that concision in return. You say the same is achievable through a library call, but given all the libraries I've written in C or Pascal over the years, saving me countless code writing, I still have to give to the language as much as it gives back to me.

So how to minimize this? If Paul is working on the 100 year language, in 100 years the notion of "computer language" may be an anachronism.

horia314 said...

Agree with most of what you say, although I find Arc rather nice. I'd like to note that even though it's basically a "who has the nicest web programming features" shootout (as opposed to a veritable language shootout), it's still relevant. Many (if not all) of today's languages have some sort of web programming support built into them. Some even have it from the start (batteries included a la Python), so in the same sense that Python makes it easy for you to do your work by providing tons of libraries, Arc will make it possible for some hypothetical future user to make web sites really fast.

oz said...

ron writes:
I submit that what you want from a programming language is not one that makes programs shorter, but one that makes programs easier to create.

this is sometimes called expressiveness. in distant past, some people in the programming language community [having been fed up with informal claims] worked hard to identify and formally quantify this notion. an example is felleisen's 1990 On the Expressive Power of Programming Languages. it is a rather difficult formal document, that tries to develop a theory of expressiveness. given recent informal claims for languages like ruby (eg. bruce tate, beyond java), and ARC (graham essays and challenges) one wishes there was some fresh work in this area. we are drowning in commentary...

Unknown said...

> I should have chosen a better example (Perl regexps perhaps).

Probably best to choose Perl regexps in a language that isn't Perl, since Perl offers numerous ways to make dealing with regular expressions easier. For (a trivial) example, extended regular expressions which allow whitespace and comments:

$content =~ m{
  (?<NPA> \d\d\d )     # Save the area code
  [-).]?               # optional ending parens, hyphen or dot
  (?<NXX> \d\d\d )     # Save the first part of local number
  [-.]?                # optional hyphen or dot
  (?<XXXX> \d\d\d\d )  # Save last four digits
}x;

my $number = $+{NPA} . $+{NXX} . $+{XXXX};

Of course, I wouldn't expect anyone with absolutely no regexp experience to know everything that's going on there, but from the comments it shouldn't be too hard to figure out.

It's interesting that Regular Expressions and Grammars in Perl 5 and Perl 6 seem to have a relationship similar to that of Arc and Lisp. In this case, Grammars are seen in many ways as the successor to the terse, confusing symbols used in many regular expressions.