Saturday, February 24, 2018

Devin Nunes doesn't realize that he's part of the government

I was reading about the long-anticipated release of the Democratic rebuttal to the famous Republican dossier memo.  I've been avoiding writing about this, or any aspect of the Russia investigation, because there is just so much insanity going on there and I didn't want to get sucked into that tar pit.  But I could not let this slide:
[O]n Saturday, committee chairman Devin Nunes (R-Calif.) accused Democrats of colluding with the government in a “cover up” of information as he announced the memo had been posted online. 
“We actually wanted this out,” Nunes told an audience at the Conservative Political Action Conference. “It’s clear evidence that the Democrats are not only covering this up, but they’re also colluding with parts of the government to cover this up.”
Congressional Democrats are "colluding with the government"?  Say what?!?  Who (or what) exactly does Devin Nunes think "the government" is?

I have news for you, Mr. Nunes: you and your fellow Congressmen are "the government", or a pretty significant part of it anyway.  You're supposed to be "colluding" (I would choose the word "working", but whatever) with yourselves to run the country.  That's your job.

But I guess Devin Nunes didn't get the memo.  So sad.

Friday, February 16, 2018

Yes, code is data, but that's not what makes Lisp cool

There has been some debate on Hacker News lately about what makes Lisp cool, in particular about whether the secret sauce is homoiconicity, or the idea that "code is data", or something else.  I've read through a fair amount of the discussion, and there is a lot of misinformation and bad pedagogy floating around.  Because this is a topic that is near and dear to my heart, I thought I'd take a whack at it myself.

First, the idea that "code is data" is a popular aphorism, but even the most casual reflection on what this slogan actually means will reveal that it can't possibly be the right answer.  Yes, it is true that, in Lisp, code is data.  But this is true of all programming languages!  If programs weren't data, how could a compiler possibly work?

What makes Lisp cool is not that programs are data (because all programs are data), but that they are a particular kind of data.  In most programming languages, programs are strings.  Strings are in fact data.  In Lisp, programs are not strings, they are linked lists (that happen to have a string representation).  And this turns out to make all the difference.

I want to be very clear about what I mean when I say that Lisp programs are linked lists, because this is really a very subtle point.  It's hard to explain, which is one of the reasons that it is very rarely explained well.  Ironically, part of the problem is that once you understand it, it seems trivial and obvious.  (Everything is easy once you know how.)  But if you don't already understand it, it can be hard to get over the hump.  So depending on which side of this divide you fall on, what I am about to say might sound like I'm belaboring the obvious, in which case I would ask you to try to remember back to the time before you understood all this (I know there was such a time because no one is born understanding linked lists).

The fundamental problem with trying to explain this is that the only tool I have at my disposal to communicate with you is text.  Your eyes scan this page, parse the black markings on the white background, and interpret those markings as letters and punctuation marks.  Your mind then further groups those letters into words, words into sentences, and sentences into concepts.  You do all this effortlessly now, but back in the day, before you knew how to read, it was hard work.  It takes a similar kind of "hard work" to read Lisp.  But like learning to read natural language, it pays dividends.

So here's a little snippet of Lisp code:

(defun rsq (x y) (sqrt (plus (times x x) (times y y))))

This code defines a function called RSQ which computes the square root of the sum of the squares of two numbers, but that is beside the point. What matters is that there are two very different ways to interpret the combination of letters and punctuation marks on the line above:

1.  As a string of characters.  55 of them to be precise (if you count spaces).

2.  As a thing with structure that is defined by the parentheses.

This is a little easier to see if we write something that has the same structure but without the evocative words:

(a b (c d) (e (f (g c c) (g d d))))

What you are looking at can be interpreted as either a string of characters (35 of them in this case), or as a list of more abstract elements.  This particular list has four elements.  The first element is the letter "a".  The second element is the letter "b".  But the third element is not a letter, it is another list.  This list has two elements, each of which is a letter ("c" and "d").  The fourth element is also a list.  This one has two elements: the letter "e" and yet another list, which in turn has three elements, two of which are themselves lists.

There is an extra wrinkle in the original example, which is that sequences of adjacent letters like "defun" and "sqrt" are also considered "one thing", or a single element of the list.  So the original example, like the second, is also a list of four elements, but the first element is not a single letter, but a "group" of letters.  In Lisp these groups are called "symbols", and like lists, they are first-class data types.
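To make the two readings concrete, here is a minimal sketch in Python rather than Lisp. It assumes symbols can be modeled as plain Python strings and Lisp lists as Python lists, which is a simplification for illustration only (real Lisp symbols are their own data type).

```python
# The same text, viewed two ways.
source = "(defun rsq (x y) (sqrt (plus (times x x) (times y y))))"

# View 1: a flat string of characters.
print(len(source))    # 55

# View 2: a nested list whose shape follows the parentheses.
# Each symbol is one element, not a run of separate letters.
form = ["defun", "rsq", ["x", "y"],
        ["sqrt", ["plus", ["times", "x", "x"], ["times", "y", "y"]]]]

print(len(form))      # 4 top-level elements
print(form[0])        # defun
print(form[2])        # ['x', 'y']
print(len(form[3]))   # 2: the symbol sqrt and one nested list
```

The string has no idea where "defun" ends or which parenthesis closes which; the nested list has that structure built in.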

The reason this is hard to explain is that strings and lists are fundamentally different things even though they look the same when you write them out this way.  What I've written above are really strings, but your brain interprets those strings as lists once you've been trained to interpret the parentheses and the letter groupings and spaces in the right way.  But what a linked list really is is something completely different.  It's a pattern of bits in memory.  You can talk about that by dumping the contents of memory and talking about how some bit patterns can be interpreted as pointers that refer to other parts of memory, or by drawing boxes and arrows.
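The boxes-and-arrows picture can be sketched in code. The following is an illustrative model, not how any particular Lisp lays out memory: the `Cons` class and the helpers `from_python` and `length` are names made up for this example, with `car` holding an element and `cdr` pointing at the next cell (or `None` at the end).

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class Cons:
    car: Any          # the element stored in this cell
    cdr: Any = None   # the next cell, or None at the end of the list


def from_python(items):
    """Build a chain of Cons cells from a (possibly nested) Python list."""
    head = None
    for item in reversed(items):
        if isinstance(item, list):
            item = from_python(item)   # nested lists become nested chains
        head = Cons(item, head)
    return head


def length(cell):
    """Count cells by following cdr pointers until the chain ends."""
    n = 0
    while cell is not None:
        n += 1
        cell = cell.cdr
    return n


lst = from_python(["a", "b", ["c", "d"]])
print(lst.car)               # a
print(lst.cdr.car)           # b
print(lst.cdr.cdr.car.car)   # c  (first element of the nested list)
print(length(lst))           # 3
```

Nothing in these cells is a parenthesis or a space. Those exist only in the written form.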

But all of those details are a distraction too.  What really matters is that by thinking of code as a linked list instead of as a string of characters you can manipulate that code easily in terms of components that are semantically meaningful.

Here's an example of what I mean by this.  Consider the following snippet of C code:

int main(int argc, char* argv[]) { ...

Now suppose you want to analyze this code.  You want to extract, say, the name of the function being defined ("main") and its arguments ("argc" and "argv").  In C this is an advanced exercise; you have to actually parse the code.  But in Lisp it is utterly trivial.  If I consider this code:

(defun rsq (x y) (sqrt (plus (times x x) (times y y))))

as a list, then to get its name all I have to do is extract the second element of the list.  And all I have to do to get the arguments is take the third element.  And the functions to do that are built in to Lisp, so I literally don't have to write any code!
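In Common Lisp the built-in accessors for this are `second` and `third`. As a rough equivalent, here is the same extraction in Python, assuming the code has already been turned into a nested list with symbols modeled as strings. The helper names `function_name` and `function_args` are made up for this sketch.

```python
# The defun form as a nested list (symbols modeled as strings).
form = ["defun", "rsq", ["x", "y"],
        ["sqrt", ["plus", ["times", "x", "x"], ["times", "y", "y"]]]]


def function_name(defun_form):
    """The name is simply the second element of the list."""
    return defun_form[1]


def function_args(defun_form):
    """The argument list is simply the third element."""
    return defun_form[2]


print(function_name(form))   # rsq
print(function_args(form))   # ['x', 'y']
```

No tokenizer, no grammar, no declaration parsing: the answer is a list index away.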

Not only that, but the parsing process that converts the string representation of the list (called an S-expression) to the internal representation of the actual linked list data structure is also trivial.  Parsing S-expressions is super easy.  You don't need a grammar or a parser generator, all you need is -- and this is no exaggeration -- a few dozen lines of code in just about any programming language [1].  And going the other way -- printing them back out -- is even easier.
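To give a sense of scale, here is a minimal sketch of an S-expression reader and printer in Python. It is not the code referenced in the footnote. It assumes a stripped-down syntax with only parentheses and whitespace-separated symbols (no strings, quotes, comments, or numbers), and the function names are made up for this example.

```python
def tokenize(text):
    """Split text into '(' , ')' and symbol tokens."""
    return text.replace("(", " ( ").replace(")", " ) ").split()


def read_form(tokens, pos):
    """Read one form starting at tokens[pos]; return (form, next_pos)."""
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    token = tokens[pos]
    if token == ")":
        raise SyntaxError("unexpected )")
    if token != "(":
        return token, pos + 1          # a bare symbol
    items = []
    pos += 1
    while pos < len(tokens) and tokens[pos] != ")":
        item, pos = read_form(tokens, pos)
        items.append(item)
    if pos >= len(tokens):
        raise SyntaxError("missing )")
    return items, pos + 1              # skip the closing paren


def parse(text):
    """Parse a string containing exactly one S-expression."""
    tokens = tokenize(text)
    form, pos = read_form(tokens, 0)
    if pos != len(tokens):
        raise SyntaxError("extra text after the first form")
    return form


def to_string(form):
    """Print a nested list back out as an S-expression."""
    if isinstance(form, list):
        return "(" + " ".join(to_string(x) for x in form) + ")"
    return form


source = "(defun rsq (x y) (sqrt (plus (times x x) (times y y))))"
form = parse(source)
print(form[1])            # rsq
print(to_string(form))    # prints the original text back
```

The reader is about two dozen lines, and the printer is four.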

This, then, is the magic of Lisp.  It's a local minimum in the amount of effort that it takes to parse and manipulate code in semantically meaningful chunks, at the "cost" of having to write code that looks a little bit weird when you first encounter it.  But this feeling quickly goes away when you realize that this weirdness is not arbitrary.  Those parens are where they are for a reason, namely, to make the syntax easy, even trivial, to parse.  Lisp was originally proposed with a more traditional syntax in addition to S-expressions, and nearly every Lisp programmer has proposed and implemented their own alternative syntax (it's almost a rite of passage).  None of them has ever caught on, because S-expressions are a huge win once you get even just a little bit used to them.  They let you do things easily that are really, really hard in other languages.  In particular, they make writing compilers so easy that doing so becomes a regular part of doing business in Lisp rather than an abstruse specialty that only a select few engage in.
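As a toy illustration of why compilers become easy once code is a list, here is a sketch of a translator from the rsq form to Python source text. It is written in Python rather than Lisp, it handles only the three operators that appear in the example (`plus`, `times`, `sqrt`), and the names `compile_expr` and `compile_defun` are made up for this sketch. A real Lisp programmer would typically do this sort of thing with macros instead.

```python
import math


def compile_expr(form):
    """Translate a nested-list expression into Python source text."""
    if isinstance(form, str):
        return form                              # a variable name
    op, *args = form
    compiled = [compile_expr(a) for a in args]   # compile sub-expressions first
    if op == "plus":
        return "(" + " + ".join(compiled) + ")"
    if op == "times":
        return "(" + " * ".join(compiled) + ")"
    if op == "sqrt":
        return "math.sqrt(" + compiled[0] + ")"
    raise ValueError("unknown operator: " + op)


def compile_defun(form):
    """Translate (defun name (params...) body) into a Python lambda."""
    _, name, params, body = form
    return "lambda " + ", ".join(params) + ": " + compile_expr(body)


form = ["defun", "rsq", ["x", "y"],
        ["sqrt", ["plus", ["times", "x", "x"], ["times", "y", "y"]]]]

python_source = compile_defun(form)
print(python_source)   # lambda x, y: math.sqrt(((x * x) + (y * y)))

rsq = eval(python_source, {"math": math})
print(rsq(3, 4))       # 5.0
```

The whole translator is a recursive walk over list elements. There is no parsing step at all, because the input already has its structure.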

And now I have to go fix some code so that it automatically generates a backtrace whenever it encounters an error, logs it, and then continues its computation as if the error had not occurred (because it's running inside an event loop where actually throwing an error would be catastrophic).  I expect this will take me about fifteen minutes because I have this in my toolbox.

---
[1] Yes, that's more than a dozen lines of code, but that's because what you see there is a complete Lisp interpreter, not just an S-expression parser.  The parser is at the bottom.  It's 30 LOC.