It might appear that I have been going to some pretty extreme lengths to belabor the obvious: we live in a world populated by material objects made of atoms. Atoms exist in particular places at particular times and move around according to laws. Collections of atoms are called systems and the positions of atoms within a system are called states. Correlations between states are called information.
All this might look like I'm trying to dodge some of the hard problems, but in fact it's the exact opposite. I'm trying to set the stage for talking about one of the hard problems, namely, where does information come from? How is it created? For this question to even make sense we need a precise definition of what we are talking about, of what information actually is, which is why all this prep work has been necessary.
But we need more than just that, because not all information seems to be created equal. There are some kinds of information whose creation appears straightforward. For example, a thermometer contains information about the temperature of its surroundings. But there is no great mystery there. It's obviously just a straightforward natural process, though it turns out that this process is not quite as easy to describe as you might first suspect. Still, there is no need to invoke anything beyond the simple lawful behavior of atoms (and a few odds and ends like electrons and photons) to explain thermometers and light switches and cameras and even computers. Thermometers and light switches and cameras and computers are all just Atoms Doing Their Thing (ADTT).
But there are two kinds of systems in nature that appear to be qualitatively different: DNA and human brains. Both of these contain information (obviously) but the information contained in DNA and human brains seems at first glance to be of a qualitatively different character than that contained in thermometers and light switches and cameras. But note well that I had to leave computers off that list. It is not at all clear any more whether human brains are qualitatively different from computers, or exactly what distinguishes the behavior of brains and computers, at least in terms of the information they contain. Up until a few years ago this was still a philosophical problem. There were legitimate-sounding arguments that there were things brains could do that computers could not, but these have all been utterly destroyed by technological progress over the last few decades. It was once argued that computers could never beat humans at chess, or speak Chinese, or distinguish dogs from cats. Just in my lifetime all of these things have gone from cutting-edge research to borderline trivial. A computer that will utterly crush any human at chess today costs a few hundred dollars and fits comfortably in the palm of your hand.
But, of course, computers would not exist without human brains to build them. Large Language Models would not work without a corpus of text generated by human brains to train on. And brains would not exist without DNA. So there still seems to be something special about DNA and brains, some quality that distinguishes them from thermometers and light switches and cameras.
What is this quality, where does it reside, and where does it originate? Where exactly does the hypothesis that everything we observe can be accounted for by Atoms Doing Their Thing fail?
One possibility is the capacity of the human brain for invention, for creating new information. LLMs can mimic the input-output behavior of brains, but only if they are first trained on a vast corpus of information that was produced by brains. All that information had to originate somewhere. The only possible source of that information, it would seem, is human brains, and specifically, some capacity that brains have and computers lack: a capacity for creativity and originality, for creating new and interesting information. What else could it possibly be?
Notice that I quietly snuck a new concept in there: new and interesting information. Creating information, that is, correlations between states of systems, is, as noted earlier, not hard. Even creating new information is not hard: just flip a coin and look at it. The information about which side landed up is new. It didn't exist before the coin landed. (Note that this might not actually be true, but that's a deep, deep rabbit hole, one which we will explore later. But for now let's just assume that it is not possible to predict the result of a coin flip, and so it really does produce new information.) But this kind of newness is obviously completely different from the kind of newness in an original novel or poem or invention. So it is not just newness that matters, it is something else, and that "something else" is what I call being interesting.
What exactly makes new information interesting? That question is even harder to answer than what makes something a chair, but it doesn't really matter. What matters is that humans mostly agree on what "interesting" means in this context, just like they mostly agree on what a chair is, and so this agreement is itself an observation that we need to explain. And eventually we will, but not just yet.
For now, let's just accept that humans seem to be able to recognize and (mostly) agree on some distinction between interesting and uninteresting information in much the same way that they can recognize and (mostly) agree on some distinction between chairs and non-chairs. This turns out to have surprisingly far-reaching and profound implications.
Consider someone named Audrey who writes an original poem or essay or novel, and a second person named Paul who makes a copy of Audrey's work and puts his own name on it. We call Audrey an Author and Paul a Plagiarist because Audrey has created new, interesting information and Paul hasn't.
But plagiarism is not limited to copying original work like poems and essays and novels. Consider a journalist (let's call her Jane) who writes news stories. Copying one of Jane's stories would also be considered plagiarism despite the fact that news stories are not original in the same sense that poems and essays and novels are. Indeed, the whole point of journalism is the exact opposite. The value of journalism is that the information it contains reflects objective facts and is not an invention of the author. So why is journalism considered valuable work? Why do news stories have by-lines? Why is copying a news story and re-publishing it under your own name considered plagiarism? Aren't news stories just facts? How can you claim ownership over a fact?
The answer is that yes, news stories are "just facts", but they are a very particular kind of fact. They are interesting facts. They are relevant facts. The value in journalism is not in the generation of the information -- that is done by current events. The value in journalism is in the filtering, the separation of the wheat from the chaff, the relevant events from the irrelevant details, the interesting from the uninteresting.
This idea of "filtering out" relevant and interesting things from irrelevant and uninteresting things will turn out to be absolutely crucial to our understanding of the world. It will turn out that everything can be understood in these terms, even the very existence of atoms, though it will be a long, long time before we get there. For now, I just want to point out that even the creation of original work like poems and novels and blog posts can be explained not as a phenomenon in its own right, separate from the kind of filtering that goes on in journalism, but as an instance of the exact same process.
I can tell you from personal experience that the contents of this blog did not spring fully formed into my mind. What you are reading now is the result of a filtering process. Before I write a single word, I have to do a lot of reading. I don't have the mental capacity to remember everything I read, so I have to filter out the relevant and interesting bits from the irrelevant and uninteresting bits. Then I start to kick around ideas for what to write about, and the same thing happens. Ideas pop into my head and I mentally sift through them thinking, "Nope, that's crap. Nope, that's crap. Nope, nope, nope... well, hmm, maybe..."
Then I write a draft, read it over, decide it's crap, throw it out, and write something different. I usually end up throwing out much more text than I actually publish. Right now, as I write these very words, there is a pile of text in this file that I wrote earlier but decided was crap after reading it over. I'm keeping it around just in case there turns out to be something salvageable in it, but most likely all of it is just going to get thrown out, never to be seen by any human eyes other than my own because, well, it's crap. The text I'm planning to discard is, as I write this, about twice as long as the text I'm currently planning to keep and publish, and God only knows how many bad ideas flitted through my head that never even made it to the keyboard at all.
And then, finally, after all that, after the reading and the writing and the rewriting and the re-rewriting, after at long last I click on the "publish" button, there is yet another filtering process that is performed by my audience (such as it is) where they -- you -- read what I wrote, decide if it has merit, and maybe, if I'm lucky, leave a comment, or recommend to someone else that they might want to read what I've written. Maybe, some day, if I am very, very lucky, my words might get the attention of an agent or a publisher, whose entire job is to filter out relevant and interesting information from the torrent being constantly generated by aspiring writers all over the world, the vast majority of whom are doomed to never have anything enter their brains that gets past anyone else's filter and thus sink into quiet obscurity in the good company of countless prior generations of aspiring authors.
In other words, what we think of as originality is actually just the result of a lot of sifting through crap to find the good stuff. Humans discover the good stuff, they recognize the good stuff, but they don't actually produce it. Originality comes not from being the first to generate a new idea, but being the first to recognize it and promulgate it. The actual generation of new ideas isn't the hard part. The hard part is the filtering, recognizing the good ideas.
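For the programmers in the audience, here is a minimal sketch of that generate-then-filter loop in Python. Everything in it is my own illustrative assumption: the names `generate`, `is_interesting`, and `sift` are made up, and "interesting" is stood in for by a toy test (the string reads the same forwards and backwards), which is obviously nothing like real human judgment. The only point is the shape of the process: generation is cheap and undirected, and all of the selectivity lives in the filter.

```python
import random

def is_interesting(candidate):
    # Toy stand-in for "interesting" (an assumption for illustration only):
    # the candidate reads the same forwards and backwards.
    return candidate == candidate[::-1]

def generate(rng, length=4, alphabet="abc"):
    # Cheap, undirected generation: a random string, much like a coin flip.
    return "".join(rng.choice(alphabet) for _ in range(length))

def sift(n_candidates, seed=0):
    # Generate n_candidates at random and keep only those that pass the filter.
    # Returns the kept candidates and the number that were discarded.
    rng = random.Random(seed)
    kept = []
    for _ in range(n_candidates):
        candidate = generate(rng)
        if is_interesting(candidate):
            kept.append(candidate)
    return kept, n_candidates - len(kept)

kept, discarded = sift(1000)
print(f"kept {len(kept)}, discarded {discarded}")
```

With this toy filter only one random string in nine passes on average, so the discard pile ends up much bigger than the keep pile. Swap in a stricter filter and the ratio gets worse, but the generator never has to get any smarter.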
Notice how much this explains. It explains why journalism is considered valuable work despite the fact that originality is antithetical to good journalism. It explains why editing is considered valuable work. It explains why people can make a living as literary agents and script readers. It explains why AIs can be so good at writing really bad prose. It explains why good ideas and good books and good blog posts are rare and can't just be generated on demand.
Now, this does not entirely eviscerate the notion that human brains are special, but it does take us down a peg. The thing that makes us special is no longer our ability to come up with new ideas, but simply our ability to recognize good ideas, which seems not quite as wonderful and special and magical as coming up with them in the first place. But it still demands an explanation of how the filtering works, and why humans seem to be so much better at it than other animals. (Whether we will actually turn out to be better at it than computers remains to be seen.)
As a sneak preview, the answer to this question will turn out to be that the laws of physics have built into them a kind of "ur-filter", the mother of all filters, which ends up selecting for our ability to filter out interesting information. Actually, there are two of these ur-filters. One of them is Darwinian natural selection, which is a filter for the ability of a system to make a copy of itself. And the second is something called quantum entanglement, which is a filter for the phenomenon of information itself. Entanglement is what produces correlations between systems. I have to emphasize that I do not expect you to believe any of this yet. To say that there are a lot of details still to fill in would be a colossal understatement. But sometimes it can help to follow the path if you know ahead of time where it is leading, even if the end is still very far off in the distance.
Thursday, June 04, 2026
Seeking God in Science part 9: Creating Information