Thursday, May 07, 2015

Why Lisp?

A number of people have contacted me about a comment I wrote yesterday on Hacker News asking me to elaborate, e.g.:
my impression is that lisp is *only* a different notation. Is that correct, or am I missing something? I don't see why it is so important that lisp code matches the data structure (and my assumption is that the match is the answer to 'why lisp') - am I overlooking the importance of macros, or is there even more that I'm still not aware of?
The answer to this question is long, so I thought I'd go ahead and turn it into a blog post.

The short version of the answer is that Lisp is not merely a different notation; it's a fundamentally different way of thinking about what programming is.  The mainstream model is that programming consists of producing standalone artifacts called programs which operate on other artifacts called data.  Of course, everyone knows that programs are data, but the mainstream model revolves around maintaining an artificial distinction between the two concepts.  Yes, programs are data, but they are data only for a special kind of program called a compiler.  Compilers are hard to write, a field of study unto themselves.  Most people don't write their own compilers (except occasionally as academic exercises), but instead use compilers written by the select few who have attained the level of mastery required to write one that isn't just a toy.

The Lisp model is that programming is a more general kind of interaction with a machine.  The act of describing what you want the machine to do is interleaved with the machine actually doing what you have described, observing the results, and then changing the description of what you want the machine to do based on those observations.  There is no bright line where a program is finished and becomes an artifact unto itself.  Yes, it is possible to draw such a line and produce standalone executables in Lisp, just as it is possible to write interactive programs in C.  But Lisp was intended to be interactive (because it was invented to support AI research), whereas C was not (because it was invented for writing operating systems).  Interactivity is native to Lisp whereas it is foreign to C, just as building standalone executables is native to C but foreign to Lisp.

Of course, there are times when you have no choice but to iterate.  Sometimes you don't know everything you need to know to produce a finished design and you have to do some experiments, and the faster you can do them the better off you will be.  In cases like this it is very helpful to have a general mechanism for taking little programs and composing them to make a bigger program, and the C world has such a mechanism: the pipe.  However, what the C world doesn't have is a standard way of serializing and de-serializing data.  And, in particular, the C world doesn't have a standard way of serializing and de-serializing hierarchical data.  Instead, the C world has a vast array of different kinds of serialization formats: fixed-width, delimiter-separated, MIME, JSON, ICAL, SGML and its offspring, HTML and XML, to name but a few.  And those are just serialization formats for data.  If you want to write code, every programming language has its own syntax with its own idiosyncrasies.

The C ecosystem has spawned the peculiar mindset that thinks that syntax matters.  A lot of mental energy is devoted to syntax design.  Tools like LEX and YACC are widely used.  In the C world, writing parsers is a big part of any programmer's life.

Every now and then someone in the C world gets the bright idea to try to use one of these data serialization formats to try to represent code.  These efforts are short-lived because code represented in XML or JSON looks absolutely horrible compared to code represented using a syntax specifically designed to represent code.  They conclude that representing code as data is a Bad Idea and go back to writing parsers.

But they're wrong.

The reason that code represented as XML or JSON looks horrible is not because representing code as data is a bad idea, but because XML and JSON are badly designed serialization formats.  And the reason they are badly designed is very simple: too much punctuation.  And, in the case of XML, too much redundancy.  The reason Lisp succeeds in representing code as data where other syntaxes fail is that S-expression syntax is a well-designed serialization format, and the reason it's well designed is that it is minimal.  Compare:

XML: <list><item>abc</item><item>pqr</item><item>xyz</item></list>

JSON: ["abc", "pqr", "xyz"]

S-expression: (abc pqr xyz)

The horrible bloatedness of XML is obvious even in this simple example.  The difference between JSON and S-expressions is a little more subtle, but consider: this is a valid S-expression:

(for x in foo collect (f x))

The JSON equivalent is:

["for", "x", "in", "foo", "collect", ["f", "x"]]

Rendering that into XML is left as an exercise.

The difference becomes particularly evident if you try to type those expressions rather than just look at them.  (Try it!)  The quotes and commas that seem innocuous enough for small data structures become an immediately intolerable burden for anything really complicated (and XML, of course, like all SGML-derivatives, is just completely hopeless).
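For readers who would rather count than type, here is a rough Python sketch of the same comparison.  The helper names to_sexp and punctuation are made up for this illustration, and "punctuation" is assumed to mean any character that is neither alphanumeric nor whitespace; the JSON side uses the standard library's json.dumps with its default separators.

```python
import json

def to_sexp(node):
    # Hypothetical helper: render nested Python lists as an S-expression string.
    if isinstance(node, list):
        return "(" + " ".join(to_sexp(child) for child in node) + ")"
    return str(node)

def punctuation(text):
    # Count every character that is neither alphanumeric nor whitespace.
    return sum(1 for ch in text if not ch.isalnum() and not ch.isspace())

code = ["for", "x", "in", "foo", "collect", ["f", "x"]]

as_json = json.dumps(code)
as_sexp = to_sexp(code)

print(as_json, punctuation(as_json))
print(as_sexp, punctuation(as_sexp))
```

Under those assumptions the JSON rendering carries 24 punctuation characters (quotes, commas and brackets) against 4 parens for the S-expression, and the gap widens with every additional level of nesting.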

The reason that Lisp is so cool and powerful is that the intuition that leads people to try to represent code as data is actually correct.  It is an incredibly powerful lever.  Among other things, it makes writing interpreters and compilers really easy, and so inventing new languages and writing interpreters and compilers for them becomes as much a part of day-to-day Lisp programming as writing parsers is business as usual in the C world.  But to make it work you must start with the right syntax for representing code and data, which means you must start with a minimal syntax for representing code and data, because anything else will drown you in a sea of commas, quotes and angle brackets.

Which means you have to start with S-expressions, because they are the minimal syntax for representing hierarchical data.  Think about it: to represent hierarchical data you need two syntactic elements: a token separator and a block delimiter.  In S expressions, whitespace is the token separator and parens are the block delimiters.  That's it.  You can't get more minimal than that.
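To make that concrete, here is a minimal sketch in Python (not Lisp, and not how any real Lisp is implemented) of a reader for exactly this two-element syntax, followed by a toy evaluator that walks the resulting nested lists.  The names tokenize, read, parse, evaluate and BUILTINS are invented for this example, and treating integer-looking atoms as numbers is an assumption of the sketch.

```python
import operator

def tokenize(text):
    # Parens are the only block delimiters; whitespace is the only token separator.
    return text.replace("(", " ( ").replace(")", " ) ").split()

def read(tokens):
    """Consume tokens from the front of the list and return one nested structure."""
    if not tokens:
        raise SyntaxError("unexpected end of input")
    token = tokens.pop(0)
    if token == "(":
        block = []
        while tokens and tokens[0] != ")":
            block.append(read(tokens))
        if not tokens:
            raise SyntaxError("missing )")
        tokens.pop(0)  # discard the closing paren
        return block
    if token == ")":
        raise SyntaxError("unexpected )")
    try:
        return int(token)  # integer-looking atoms become numbers (sketch assumption)
    except ValueError:
        return token       # every other atom stays a plain string

def parse(text):
    tokens = tokenize(text)
    tree = read(tokens)
    if tokens:
        raise SyntaxError("unexpected trailing input")
    return tree

# Hypothetical built-ins for the toy evaluator; not any real Lisp's standard library.
BUILTINS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def evaluate(expr, env=BUILTINS):
    if isinstance(expr, int):
        return expr        # numbers evaluate to themselves
    if isinstance(expr, str):
        return env[expr]   # symbols are looked up in the environment
    head, *args = expr     # a list is a function call: (function arg ...)
    function = evaluate(head, env)
    return function(*(evaluate(arg, env) for arg in args))

tree = parse("(+ 1 (* 2 3))")
print(tree)            # ['+', 1, ['*', 2, 3]]
print(evaluate(tree))  # 7
```

The whole reader is about two dozen lines, and the evaluator never touches text at all: it only sees the nested lists.  The same parse function turns (for x in foo collect (f x)) into a nested list without any changes, although this toy evaluator would not know what to do with it.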

It is worth noting that the reason the parens stick out so much in Lisp is not that Lisp has more parens than other programming languages; it's that Lisp has only one block delimiter (parens), and so the parens tend to stick out because there is nothing else.  Other languages have different block delimiters depending on the kind of block being delimited.  The C family, for example, has () for argument lists and sub-expressions, [] for arrays, {} for code blocks and dictionaries.  It also uses commas and semicolons as block delimiters.  If you compare apples to apples, Lisp usually has fewer block delimiters than C-like languages.  JavaScript in particular, where callbacks are ubiquitous, often gets mired in deep delimiter doo doo, and then it becomes a cognitive burden on the programmer to figure out the right delimiter to put in depending on the context.  Lisp programmers never have to worry about such things: if you want to close a block, you type a ")".  It's always a no-brainer, which leaves Lisp programmers with more mental capacity to focus on the problem they actually want to solve.

And on that note, I should probably get back to coding.  Iteratively, of course :-)

[This post has been translated into Chinese and Japanese.]

12 comments:

  1. I appreciate the minimal syntax shared between data and code (executable data?) but that doesn't seem to be the only reason that Lisp is better. I still don't understand why the question "Why LISP?" is so often answered by "Macros." How do these two concepts tie together in a way that makes programming a more intimate approach to the machine itself?

  2. @Manly Geek has it right.

    If you compare clara to drools, drools is a disaster because it incorporates a half-baked Java parser, which means the error messages you get make no sense at all.

    Many forms of LISP don't face the problems of modern programming squarely, but Clojure does.

  3. Why Lisp? Thanks to the spirit of Lisp, a non-professional coder like me can build a tool fitting his needs, like this: alphawiki. I don't think that I could have done such a thing with other languages.

  4. @manlyGeek:

    An awful lot has been written about macros and I didn't really want to reinvent the wheel here. A full answer to your question would be another post (maybe tomorrow) but the TL;DR version is: yes, macros are important. But it's the notation that makes macros possible.

    The key insight is that Lisp programs are not text, they are data structures. You can build those data structures by parsing text, or you can build those data structures in other ways. Macros are one of those other ways.

  5. @Ron,
    That's actually the kind of association between macros and that minimal syntax that I would love to see demonstrated. I "know" it's there, but I haven't seen an example that brings it home for me.

  6. Good post, clear idea. It took me a while to figure all that out.

    Doug Hoyte says something similar in his book "Let Over Lambda".

  7. Ron,

    This ranks with Paul Graham's succinct explanation as the most enticing reason to start learning Lisp that I've read. Super exciting stuff. I honestly wish I could put down the mainstream compiler now and start immediately, but I'll have to wait a little bit.

    I'm entering a period where I will not be a wage slave, but am sufficiently interested in SW architectures and learning that I will not be idle.

    I think your writings are hitting a lot more minds than your comments would suggest. I won't demand you write a book this time (haha although I still have hope), I'll just say many thanks. This post has affected my plans over the next several years, at least in how I plan to spend my idle hours.

    There's tons of writing out there hammering it home how important it is to learn to use Lisp. But your method of the carrot is so much more effective in my case than the stick. At a certain point of being overwhelmed by all the stuff you're supposed to learn to be a great SW person, you just stop caring and throw it on the pile of stuff you're ignoring. Kinda like an overflowing email inbox.

    When I'm fascinated by something, I'm one of the types that will spend 16 hours a day researching it if allowed. This really sounds like something where I could just have an idea, quickly implement it, then laugh or scratch my head at the result. Do some thinking/research, rinse, repeat. Those types of systems are addictive, and are the types of things that lead to decade-long obsessions.

    Thanks again for infecting my mind in a good way.

    Sincerely,
    oft-lurker (and link-sender), rare-commenter

  8. > This ranks with Paul Graham's succinct explanation as the most enticing reason to start learning Lisp that I've read.

    Wow, that is high praise indeed. Thanks!

  9. And to be clear, when I said "your comments" I really meant the comment section on this blog.

    Out of curiosity, do you have a solid idea of how many people read your blog? I know you know the page hits, I'm just curious if you know how many regular readers you have that really read the s$$$ out of something you wrote. This is to me a very niche place, where you can get very high quality writing that doesn't appeal to the vast majority.

    (The philosophy posts go right over my head, but I try. Still better than reading a textbook, i.e. hearing it from someone who considers themselves a layman and tries to describe it that way.)

  10. > Out of curiosity, do you have a solid idea of how many people read your blog?

    Nope. I have 90 official subscribers (picked up 4 new ones just in the last week) and I tend to get a few hundred pageviews on a normal day, but I have no idea how much of that is bot traffic. When I get onto the HN front page that jumps up dramatically. "Why Lisp" has gotten 41,000 pageviews so far.

    Some time back I heard some informed speculation that I actually have a few hundred people reading the Ramblings regularly on stealth RSS feeds. But the truth is I don't know. I only have half a dozen or so regular commenters :-(

    > I think your writings are hitting a lot more minds than your comments would suggest. I won't demand you write a book this time (haha although I still have hope), I'll just say many thanks.

    Thanks very much for the kind words! Getting feedback like that occasionally really keeps me going. I actually am planning to write a book some day, but right now my startup is taking up all of my time.

  11. Cool.

    Yeah, not sure how much you realize it, but you've got the rare piece of connecting laymen to concepts that are normally only available via textbooks or hardcore study. (talking out of my ass at this point, since I've neither done it nor have the other part): seems to me like you had to live one part with your life in the game and really experienced it, and then get some distance via quitting/time, then you can write about it clearly. Both people with their nose in the game still, and spectators, cannot do that without major flaws, or missing the forest for the trees. Thus, a pretty small minority can do it well.

    In areas where I'm very green, I seek out those authors. I don't mean to single out this example as much as it seems I am, but I love the book A Short History of Nearly Everything by Bill Bryson. If you wanted to give a 12 year old a book they could finish in a couple days and give them a pretty good idea of how things got to the way they are, that would be my choice. (haha, well, minus politics and religion and wars and nasty human behavior and all the rest. But I'd spare the poor kid that part. I guess remove humans and the sentence is valid.)

    Okay, sorry, I'm probably actually putting annoying pressure with the compliments again. I just find succinct writers willing to go on tangents to be rare (sounds like a contradiction, but I can't resolve it briefly). I'll start my own blog and then you can come pester me :)

  12. That was very nicely done!
    I love the ideas of LISP but there's a reason why people don't take to it and it's not all about the strange parens. It's the mindset.

    It's very difficult to reason about LISP programs, especially as a newbie. The syntax of a C-like language helps to delineate ideas, whereas LISP is one thing (the S-expression). This seems like it should be easier but it's the opposite. The lack of syntax forces a lot of cognitive load on the reader of the program. The complexity of syntax is shifted into the user's mind. Of course, the more you use LISP, the easier it gets. It's just a lot more difficult of a language to begin with. Simple is not necessarily easy.
