Rondam Ramblings

Monday, July 29, 2024

Computation: Math and the Church-Turing Thesis

(Part 9 in a series on the scientific method)

In the last installment of this series I addressed a philosophical question: what does it mean for a mathematical statement to be true? I tackled this in the context of a general definition of scientific truth, namely, that a statement is true if it corresponds in some way to the actual state of affairs in objective reality.

(Note that this definition does not assume that objective reality exists. It's possible that there is no objective reality, and that therefore there are no true statements under this definition. But the existence of objective reality is, to put it mildly, a fairly well-established scientific theory, and so we're justified in proceeding on the assumption. Spoiler alert: when we get to talking about quantum mechanics, we will encounter some data that calls the existence of objective reality into serious question, often leading to much philosophical hand-wringing. I'm sweeping all that under the rug for now, but I wanted to be up-front about the fact that I am sweeping it under the rug. I promise to unsweep it later.)

In this installment I want to talk about a different aspect of math, namely, how to explain its ubiquity in science. In this series I have been trying very hard to avoid math because it tends to scare people away. I think that's mainly due to very poor math pedagogy. Math need not be any scarier than science itself. One of the things I hope to accomplish here is to demystify math a bit and make it a little less intimidating.

So here I am going to approach the phenomenon of mathematics as a scientific Problem (with a capital P), namely: why is math so ubiquitous in science? And why is it so alien and intimidating?

Let's start by asking: what actually is math? It's hard to formulate a coherent definition. Wikipedia says, "Mathematics is a field of study that discovers and organizes methods, theories and theorems that are developed and proved for the needs of empirical sciences and mathematics itself." That might be technically correct, but it's not very illuminating, so instead of trying to define it, I'm going to sweep that under the rug as well and simply observe that math, kind of like pornography, seems to be an identifiable thing that most people recognize when they see it.

One of the characteristic features of math that makes it recognizable is its use of weird symbology, funny-looking squiggles, Greek letters, non-Greek letters written in distinctive fonts (and, of course, numbers). Math also attaches meaning to the relative spatial relationship between symbols in ways that no other form of human communication follows. x² means something very different from x₂ which means something very different from 2x.

But all this weird symbology is just window-dressing. None of it is essential; it is possible to do math without it. In fact, for most of human history math was done using plain old language, augmented with an occasional drawing. The only "weird symbols" were numbers. You can see this by perusing the original manuscript of Newton's Principia, the book that launched the modern scientific revolution. It consists almost entirely of (Latin) text and some drawings. There is some math, but none of the notation goes beyond what you would find in an elementary-school math class. The symbology used today only came into being in the last few hundred years. It was invented because doing math with regular language was becoming too unwieldy. The state-of-the-art technology for doing math at the time was pen and paper and chalkboards, and so the notation was optimized to allow a lot of information to be compressed into a minimal amount of space. But what matters is not the symbology, but rather what the symbols mean. "√3" means exactly the same thing as "the square root of 3", which means exactly the same thing as, "The number which, when multiplied by itself, gives you three." The only reason to use "√3" is that it takes up less space and is faster to write by hand.

But there is more to math than just weird symbols. Not all combinations of symbols make sense. "√3" makes sense but "3√" doesn't. It's no different from any other language. "The square root of three" makes sense, but "three root square the of" does not. Math notation is just another language, and it has grammatical rules just like any other language.

So what exactly is it about math that makes it special? If the symbology is just window-dressing, and the grammatical rules are no different from natural language, what exactly is it that makes math such a powerful lever for doing science? The answer is that in addition to grammatical rules, which tell you how to compose mathematical symbols so that they say something that makes sense (but isn't necessarily true), there is another set of rules in math that don't have a counterpart in natural language. These rules tell you how to manipulate mathematical symbols to produce new mathematical truths, arrangements of symbols that correspond in some way to objective reality, from old ones.

The most familiar example of these rules is the algorithms we learn in elementary school for doing arithmetic. Using these rules we can know, for example, that if we take (say) a hundred and ninety-two things and combine them (for some manner of combination) with (say) three hundred and seventeen things that we will end up with five hundred and nine things. We figure this out not by thinking about the actual words that I wrote out, nor directly about the quantities those words denote, but instead in terms of the symbols "192+317=509". By using these symbols and the right set of rules for manipulating them, we can figure out that 192 plus 317 is 509 a lot faster than we could if we tried to actually count these things.
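To make the point about rules concrete, here is a minimal sketch of the grade-school addition procedure written so that it only ever looks at digit symbols, never at the numbers as a whole. The function name and the choice to represent numbers as strings are my own illustrative assumptions; the only "arithmetic facts" it relies on are sums of single digits, which is the table we all memorized as kids.

```python
def add_decimal_strings(a: str, b: str) -> str:
    """Add two non-negative integers given as strings of decimal digits."""
    digits = "0123456789"
    result = []          # output digits, least significant first
    carry = 0
    i, j = len(a) - 1, len(b) - 1
    # Work right to left, one column at a time, exactly as on paper.
    while i >= 0 or j >= 0 or carry:
        da = digits.index(a[i]) if i >= 0 else 0
        db = digits.index(b[j]) if j >= 0 else 0
        total = da + db + carry          # at most 9 + 9 + 1 = 19
        result.append(digits[total % 10])
        carry = total // 10
        i -= 1
        j -= 1
    return "".join(reversed(result)) or "0"


print(add_decimal_strings("192", "317"))  # 509
```

Nothing in the procedure knows or cares what the digits are counting. It just shuffles symbols, and the answer comes out right anyway.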

The art of mathematics, the thing that keeps mathematicians employed, is figuring out what the right rules are. That is a topic for another day, with a long and storied history. The main point here is that the thing that makes math useful in science is not its notation, not the weird symbology, but the rules, the fact that it is possible to create sets of rules for manipulating symbols such that the behavior of the symbols corresponds reliably to the behavior of objective reality. In other words, it is possible to create sets of rules for manipulating symbols that turn the symbols into a model of objective reality. This turns out to be much more than just a handy-dandy way of producing models; it turns out to have some truly profound philosophical implications. To see why, we need to take a brief historical detour.

In 1936, Alan Turing published a paper entitled "On Computable Numbers, with an Application to the Entscheidungsproblem" (which is German for "decision problem"). The title obscures what this paper is really about. In it, Alan Turing invented software.

In 1936 the words "computer" and "calculator" did not refer to machines as they do today but rather to a profession. A computer (or calculator) was a human who performed computations (or calculations), which is to say, who manipulated mathematical symbols according to rules. The useful results of this process were generally numbers, which is why the title of Turing's paper refers to "computable numbers", but numbers are a red herring here. What matters is not the numbers per se, but the symbols; that the symbols can be taken to stand for numbers is a mostly irrelevant detail. What really matters is the rules by which the symbols are manipulated.

Turing's paper asks the question: what would happen if we tried to build a machine that does what (human) computers/calculators do? He then goes on to describe a machine that would do just that. The machine consists of a control mechanism which can be in any of a finite number of states. These states encode the rules by which the machine manipulates symbols. The machine has a sheet of paper divided into a grid. Each square of the grid contains exactly one of a finite repertoire of symbols. The symbols can be erased and re-written by the machine. The supply of paper is endless. (The technical term is "unbounded", which is not quite the same as "infinite". It's more like there is a paper factory available that makes more paper as needed, but at any given time the actual amount of paper that the machine has to work with is finite. The distinction between infinite and unbounded won't matter much here but will turn out to be crucial later on.)

The machine's operation is dead-simple. At any given time, its control mechanism is in one of the possible states, and is "looking" at one square of the grid on the paper. The control mechanism is just a big lookup table indexed by state and symbol. Each entry in the table contains just three pieces of information: a symbol to write onto the paper in the current grid square (possibly the same as the symbol that is already there), a direction in which to move on the paper to arrive at a new grid square, and a new state for the control mechanism to go into. That's it.
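Here is a minimal sketch of such a machine in Python, using the one-dimensional version of the paper. The lookup table is a dictionary indexed by (state, symbol), and each entry holds the three pieces of information described above. The example table, the state names, the "_" blank symbol and the step limit are all assumptions of mine for illustration; this particular table adds one to a binary number.

```python
def run_turing_machine(table, tape_input, start_state, halt_state,
                       blank="_", max_steps=10_000):
    """Run a lookup-table machine and return the final tape contents."""
    tape = dict(enumerate(tape_input))   # square index -> symbol
    state, head, steps = start_state, 0, 0
    while state != halt_state:
        if steps >= max_steps:
            raise RuntimeError("step limit reached without halting")
        symbol = tape.get(head, blank)   # unwritten squares read as blank
        write, move, state = table[(state, symbol)]
        tape[head] = write
        head += 1 if move == "R" else -1
        steps += 1
    if not tape:
        return ""
    cells = [tape.get(i, blank) for i in range(min(tape), max(tape) + 1)]
    return "".join(cells).strip(blank)


# (state, symbol read) -> (symbol to write, direction, new state)
INCREMENT = {
    ("seek_end", "0"): ("0", "R", "seek_end"),
    ("seek_end", "1"): ("1", "R", "seek_end"),
    ("seek_end", "_"): ("_", "L", "carry"),
    ("carry", "1"): ("0", "L", "carry"),
    ("carry", "0"): ("1", "L", "halt"),
    ("carry", "_"): ("1", "L", "halt"),
}

print(run_turing_machine(INCREMENT, "1011", "seek_end", "halt"))  # 1100
```

The dictionary that stands in for the paper grows only when the machine visits a new square, which is the "unbounded but not infinite" idea in miniature.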

I've described Turing's machine in terms of a "sheet" of paper, but Turing's original paper and the literature in general refers to a one-dimensional "tape" rather than a two-dimensional "sheet". This turns out not to matter. You can think of the "tape" as being a "sheet" that has been cut up into strips and glued end-to-end. The net result is the same in both cases. But in order to conform to tradition, I'm going to switch my terminology and start using "tape" instead of "sheet".

Note that the control mechanism is part of the machine's hardware, and once built, it cannot be changed. If you want to make a Turing machine that performs a different task, you need to build a new machine. Or so it would seem at a casual glance.

The remarkable thing about Turing machines is that it is possible to create a control mechanism that is universal, that is, which allows one machine to perform the same computation as any Turing machine you can imagine without changing the hardware. How does it do this? By describing the control mechanism of another Turing machine as symbols on the tape. The universal machine reads the description on the tape, and then behaves as if its control mechanism were the one described on the tape rather than the (universal) control mechanism that it actually has.
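A rough sketch of the idea: the function below is one fixed piece of "hardware" that takes a single string, reads a machine description off the front of it, and then behaves like the machine described. To be clear about the assumptions, this is a Python interpreter rather than a genuine lookup-table machine, and the rule format (rules separated by ";", fields by ",", a "#" before the data, "H" as the halting state, the first rule's state as the start state) is something I made up for illustration. It is not Turing's encoding.

```python
def universal(tape_text, blank="_", max_steps=10_000):
    """Read 'description#data', then act like the machine described."""
    description, _, data = tape_text.partition("#")
    rules, start = {}, None
    for rule in description.split(";"):
        state, symbol, write, move, new_state = rule.split(",")
        rules[(state, symbol)] = (write, move, new_state)
        if start is None:
            start = state                # first rule names the start state
    tape = dict(enumerate(data))
    state, head, steps = start, 0, 0
    while state != "H":
        if steps >= max_steps:
            raise RuntimeError("step limit reached without halting")
        write, move, state = rules[(state, tape.get(head, blank))]
        tape[head] = write
        head += 1 if move == "R" else -1
        steps += 1
    if not tape:
        return ""
    cells = [tape.get(i, blank) for i in range(min(tape), max(tape) + 1)]
    return "".join(cells).strip(blank)


# Two different "machines", both of which are nothing but symbols.
FLIP_BITS = "a,0,1,R,a;a,1,0,R,a;a,_,_,R,H"
APPEND_ONE = "s,1,1,R,s;s,_,1,R,H"

print(universal(FLIP_BITS + "#1011"))   # 0100
print(universal(APPEND_ONE + "#111"))   # 1111
```

The same function does two unrelated jobs, and the only thing that changed between them was the text it was handed.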

At first glance this seems like an interesting intellectual curiosity, but not one with earth-shattering consequences. From our current vantage point in a world with ubiquitous computers it might even seem a little banal. But the computer revolution we are living in is only possible because of the existence of universal Turing machines. Computing and calculating machines existed before 1936, but they were all custom-made for a particular task. But if you build a (universal) Turing machine, that is the last machine you will ever have to build. After that, anything you can imagine that can be done by any Turing machine can be done just by writing symbols to a tape, that is, by writing software. There are only three reasons to ever build new hardware: to make the machine run faster, to make it more energy efficient, or if you want to do something that cannot be done by any Turing machine.

Let's think about that third possibility for a moment. A universal Turing machine can do anything that any Turing machine can do, but that doesn't mean that it can do anything. In fact, Turing himself showed that there are certain things that no Turing machine can do, like solve the Entscheidungsproblem or the closely related halting problem. Might we be able to design a completely different machine that can solve these problems?

There is a word for such a machine: it is called an oracle for the decision problem (or the halting problem, or whatever). But think about what it would take merely to design or even describe such a machine, let alone build one. Such a design would have to be rendered into a physical medium somehow; it would have to be written down, or carved into stone, or spoken aloud as an oral tradition, or something. And if the design could be rendered into a physical medium, then it could be expressed in terms of symbols. Maybe part of the design would consist of drawings, but a drawing can also be expressed in terms of symbols, specifically, in the form of a computer program (i.e. a Turing machine) that produces that drawing on a suitable display. Nowadays just about every image you see is actually produced in this way.

Maybe there is some way of rendering a design that cannot be reduced to symbols? There is no way to prove that this is impossible, but if you can come up with one, it would be one of the greatest breakthroughs in the history of human intellectual endeavor. It is hard to imagine even what a description of such a breakthrough would look like. Think about it: all of the information your brain has access to comes through your senses, primarily seeing and hearing. But we already know that these can be reduced to symbols because we have computers that can produce any image or sound that we can imagine. Not only that, but we also have digital cameras and digital microphones, so computers can reproduce not only any image or sound that we can imagine, but also any image or sound that can possibly be produced in this universe. If it exists in this universe, then we can record it or take a picture of it, and thus reproduce it on a computer.

We can stretch our imaginations even further. Maybe we cannot design or describe a machine that does something a Turing machine cannot do, but maybe we can produce such a machine by some other means. Maybe there exists a natural phenomenon in our universe whose behavior cannot be described by symbols. Again, it is hard to imagine what such a phenomenon would even look like. It would not be possible to describe such a phenomenon in terms of measurements because all measurements can be rendered as symbols. It would even be impossible to describe such a phenomenon using words, because words are symbols. So such a phenomenon would have to be radically different from anything in our day-to-day experience. It is, of course, impossible to prove that there does not exist something beyond the bounds of our imaginations. But that is what it would take to produce a machine that transcends the Turing machine. (I'm going to start referring to such a machine as a Transcendent Turing Machine or TTM.)

Now, there is actually a candidate for a natural phenomenon that, at least as of this writing, transcends our current understanding, and that might therefore be a TTM, or at least a signpost on the road to a TTM: the human brain. In particular, human brains produce the phenomenon of consciousness (at least mine does). Consciousness is a really weird thing, devilishly difficult to render into symbols or otherwise to get any kind of a handle on. It's hard even to demonstrate that it actually exists despite the fact that everyone professes to experience it. Consciousness at first glance has many of the characteristics we would expect from a phenomenon that indicates something happening beyond what can be described by Turing machines. (There is another phenomenon that some people claim transcends TMs, and that is the phenomenon of life itself. This is just demonstrably wrong, though demonstrating it is not easy, so that will have to wait for a future installment.)

For now I want to make one more observation: the important thing about Turing machines is not just that it is possible to build them, or even that it is possible to build a universal Turing machine. The important thing about (universal) Turing machines is that building one is not particularly difficult. Once you start building systems of any complexity at all, it is actually hard to avoid building a universal Turing machine. You could make one out of Legos or wood. The ones we build for actual use are made of sand.

The fact that UTMs are so simple and yet so powerful is a deep and profound observation not just about physical reality, but about what is logically possible. No one ever builds Turing machines according to Turing's original description except for pedagogy or as a parlor trick. The nature of Turing machines is as much in their description in terms of symbols as in the reifications of that description we humans have built in physical reality. The fact that it is hard for us to even imagine what a Transcendent Turing Machine might look like is a deep insight. It is the reason that math is so ubiquitous and effective in science.

The claim that everything that can be described can be described by a Turing machine is called the Church-Turing thesis. (The "Church" part refers to Alonzo Church.) The Thesis is not usually stated in this way; the more common formulation is that any computable function can be computed by a Turing machine. But if you think about it, this amounts to the same thing. If there were a function that could be computed (whatever that might mean) but not by a Turing machine, it would mean that there existed in this universe some physical phenomenon that could "compute" (whatever that might mean) some function that a Turing machine could not. That is to say, this phenomenon would have to exhibit some behavior that a Turing machine could not emulate.

The Church-Turing thesis cannot be proven, but a persuasive counterexample has never been demonstrated. If it's true (and if it isn't that would be Big News) then it places some hard constraints on what is possible in our universe, and that in turn has significant philosophical consequences. Either the human brain is a TTM or it is not. If it is, what exactly is happening in there that enables this transcendence? Is the same thing happening elsewhere in the universe, like maybe in the brains of dolphins or elephants? Intelligent aliens? LLMs? And if the behavior of our brains can be described by a Turing machine, what does that imply for our subjective experience of consciousness and free will? Turing machines are, after all, deterministic. If we're nothing more than machines following a program, how can we have moral agency?

Spoiler alert: these questions have answers. We humans are not TTMs but we have moral agency nonetheless. I promise I will get to that. Stay tuned.

Saturday, July 20, 2024

Sorry about the radio silence

It has been nearly two months since I last posted anything here. This is far from my longest period of radio silence, but in this case it was right in the middle of a series I had been writing on the scientific method, and I really didn't intend to stop and lose momentum. But two things happened. First, I decided to write a chapter about mathematics and that turned out to be a much more difficult subject to tackle than I thought it would be, so I've been having trouble figuring out where to go from there. And second, on June 5, my mother died. Unlike my sister's death nearly four years ago, my mom's passing was not unexpected. She was 88, and had been fighting cancer for 9 years. I spoke to her two days before she died and one of the last things she said to me was, "I've had a good life." We had no unfinished business.

However, even though my mom's death didn't come with the emotional toll that my sister's did, there is always a long list of chores that need to get done whenever someone cashes in their chips, and that has been occupying me for longer than I expected.

Anyway, just wanted to put this out there in case anyone was wondering what had become of me.

Ruth Gat 1936-2024

Friday, May 31, 2024

My (not) Twitter Account Got Hacked

My (not) Twitter (I refuse to refer to it by a single letter) account got hacked and so I got to see first-hand how utterly inadequate (not) Twitter's security measures are. The hacker immediately changed both my password and email address so I can no longer access the account and I can't do a password reset. Which is such an incredibly stupid design because of course anyone who breaks into an account is going to do those two things. It's all the more infuriating because all they would need to do to fix this is, whenever someone changes their email account, to send a message to the old account saying, "Please confirm that you want to change your email account." Or maybe just put a time delay on the change becoming effective so that when the actual owner of the account gets the notification that they have been hacked there is actually a chance that they could, you know, do something about it rather than just getting locked out immediately.

Not only am I locked out of my account, but I can't even look at my own feed to see what the hacker is posting on my behalf because (not) Twitter requires you to log in to see any of its content. So to see what the hacker is doing, I would need to create a new account.

Not that any of this matters much. I hardly ever used my account. I can't remember the last time I logged in. I think I had seven followers. But the utter stupidity of their design still steams my clams.

[UPDATE] I reported the problem to (not) Twitter tech support and it took less than five minutes to receive this reply:

We’re writing to let you know that we’re unable to verify you as the account owner. We know this is disappointing to hear, but we can’t assist you further with accessing your account.

In other words, if someone hacks your (not) Twitter account, you are just shit out of luck.

Thursday, May 30, 2024

Donald Trump is a Convicted Felon

The question of whether or not Republicans will nominate a convicted felon to be their candidate for President of the United States is no longer a hypothetical. Unless they can somehow persuade him to withdraw from the race (and good luck with that) they will have no choice. The primaries are over. The convention will just be a rubber stamp. And Trump's conviction will not disqualify him either from the ballot or from office should he win. Which is still a very real possibility.

Which is as it should be. Because if Trump's conviction was, as he claims, a politically motivated railroading, then We The People should be able to overturn it at the ballot box.

But that's a big "if". Remember, Trump was not convicted by a judge, he was convicted by a jury of twelve ordinary citizens who voted unanimously to convict on every single one of 34 felony counts, and they did it in three days. It was not an acquittal. It was not a hung jury. It was not even close.

Think about what it would take for whoever had it in for Trump to arrange that. Every single one of the twelve jurors would have had to vote to convict despite the prosecution not having met its considerable burden of proof beyond a reasonable doubt. How likely is that?

One of the arguments that Trump's defense put forward was that Michael Cohen was lying, and one of the counter-arguments put forward by the prosecution was to ask the rhetorical question, why would Cohen lie? Sure, Cohen despised Trump, which is not surprising considering that Cohen actually went to prison for the things he did at Trump's behest. But Cohen was under oath. Would he really risk going back to prison on a perjury charge just to service a vendetta?

On the other hand, it's pretty easy to see why Trump would lie and claim he's being railroaded. He is not under oath. He is not risking perjury charges by lying. Lying his way to an election victory is pretty clearly to his own personal benefit. He has literally built a tremendously successful career on lying and covering it up. Why stop now? The strategy has never failed him before.

There are only two possibilities here: either Trump was railroaded, or he was not. If you think he was railroaded, if you really think he was innocent, and if you really think he would do a good job as President, by all means, vote for him. But if you don't, if you think the jury got it right and Trump is actually guilty, then consider that Trump is now an unrepentant convicted felon who has been lying about his innocence all along and continues to lie about it. What else might he be lying about?

I get that you might be frustrated with Joe Biden. I am too. I think the Democrats in general have their heads shoved so far up their butts that you can't see the tips of their neckties. I hate what Israel is doing in Gaza. Inflation sucks (though I feel the need to point out that the alternative was to allow the economy to collapse during the pandemic). The pronoun thing is out of control.

But Trump's conviction changes the calculus in a very important way: that he is a convicted felon is now an undeniable objective fact. There are only two possibilities: either he was railroaded, or he is a liar. There are no other options. You have to choose. If he was not railroaded, then he is a liar. And if you decide that he is a liar, then regardless of how you feel about anything else, do you really want to let someone like that put their finger back on the nuclear button?

Monday, May 27, 2024

Truth, Math, and Models

(Part 8 in a series on the scientific method)

In the last installment I advanced a hypothesis about what truth is, which is to say, I suggested a way to explain the broad consensus that appears to exist about truth. That explanation was: there is an objective reality "out there", and true statements are those that in some sense correspond to the actual state of affairs in that objective reality. This was problematic for statements like, "Gandalf was a wizard" because Gandalf doesn't actually exist in objective reality, but that was accounted for by observing that the actual meanings of sentences in natural languages often go beyond the literal.

But there is one aspect of truth that is harder to account for, and which would appear at first glance to be a serious shortcoming of my theory: math. Most people would consider, for example, "1+1=2" or "7 is prime" to be true despite the fact that it's hard to map those concepts onto anything in objective reality. I can show you "one apple" or "one sheep", but showing you "one" is harder. The whole point of numbers, and mathematics and logic in general, is to abstract away from the physical. Numbers qua numbers do not exist in the real world. They are pure abstraction, or at least they are supposed to be. Mathematical truth is specifically intended not to be contingent on anything in the physical world, and so it would seem that my theory of truth fails to capture mathematical truth.

Some philosophers and religious apologists claim that it is therefore impossible to ground mathematical truth in objective reality, that the existence of mathematical truth requires something more, some ethereal realm of Platonic ideals or the Mind of God, to be the source of such truths. It's a plausible argument, but it's wrong. Mathematical truth can be understood purely in terms of objective reality. Specifically, mathematics can be viewed as the study of possible models of objective reality. In this installment I will explain what I mean by that.

There are a lot of different examples of (what is considered to be) "mathematical truth" but let me start with the most basic: elementary arithmetic. These include mundane truths about numbers, things like "two plus three equals five" or "seven is prime." It would seem at first glance that numbers used in this way don't refer to anything in objective reality. I can show you two of something but I can't show you "two" in isolation.

There is an easy answer to this: numbers in common usage are not nouns, they are adjectives. The reason I can't show you "two" without showing you two of something is the same reason I can't show you "green" unless I show you a green thing. Adjectives have to be bound to nouns to be exhibited, but that doesn't mean that "green" does not exist in objective reality. It does, it's just not a thing. Green is a color, which is a property of things, but it is not itself a thing. Likewise, "two" is not a thing, it is a quantity, which is a property of (collections of) things. And the reason that two plus three equals five is that if I have two things and I put them together with three other things the result is a quantity of things to which English speakers attach the label "five". Likewise "seven is prime" can be understood to mean that if I have a quantity of things to which English speakers attach the label "seven" I cannot arrange those things in a complete, regular rectangular grid in any way other than the degenerate case of putting them all in a line.
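The grid reading of "seven is prime" can be checked mechanically. The sketch below lists every way to lay out n things in a complete rectangular grid with at least two rows and at least two columns, and calls n prime when there are none. The function names are my own, and excluding the one-row case is just my way of encoding "the degenerate case of putting them all in a line".

```python
def rectangular_grids(n):
    """All (rows, cols) with 2 <= rows <= cols that hold exactly n things."""
    grids = []
    rows = 2
    while rows * rows <= n:
        if n % rows == 0:            # n things fill rows x (n // rows) exactly
            grids.append((rows, n // rows))
        rows += 1
    return grids


def is_prime(n):
    """Prime means: at least two things, and no non-degenerate grid."""
    return n >= 2 and not rectangular_grids(n)


print(rectangular_grids(7))    # []
print(rectangular_grids(12))   # [(2, 6), (3, 4)]
print(is_prime(7))             # True
```

Twelve apples can be laid out two different ways; seven apples only ever make a line.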

But this explanation fails for straightforward extensions of the natural numbers, like negative numbers or irrational numbers or imaginary numbers. I can show you two apples, and I can explain addition and subtraction in terms of putting groups of apples together and taking apples away, but only for positive numbers. I cannot explain "three minus five equals negative two" in terms of starting with three apples and taking five away because that is just not physically possible. Likewise I cannot show you a square with a negative area, and so I cannot explain the square roots of negative numbers in terms of anything physical (at least not easily).

There are two more cases where the numbers-are-adjectives theory fails. The first is truths that involve generalizations on numbers like "There are an infinite number of primes." That can't be explained in terms of properties of physical objects because we live in a finite universe. There are not an infinite number of objects, so if numbers are meant to describe a quantity of a collection of actual physical objects, then there cannot be an infinite number of them either.

Finally, there are a lot of objects of mathematical study beyond numbers: manifolds, tensors, vectors, functions, groups, to name just a few. Some of these areas of study produce mathematical "truths" that are deeply weird and unintuitive. The best example I know of is the Banach-Tarski "paradox". I put "paradox" in quotes because it's not really a paradox, just deeply weird and unintuitive: it is possible to decompose a sphere into a finite number of parts that can be reassembled to produce two spheres, each the same size as the original. That "truth" cannot be explained in terms of anything that happens in objective reality. Indeed, the reason this result seems deeply weird and unintuitive is that it appears to directly contradict what is possible in objective reality. So the Banach-Tarski "paradox" would seem to be a counter-example to any possible theory of how mathematical truth can be grounded in objective reality. And indeed it is a counter-example to the idea that mathematical truths are grounded in actual objective reality, but that is not news -- we already established that with the example of negative numbers and imaginary numbers.

I've already tipped my hand and told you that (my hypothesis is that) mathematics is the study of possible models of objective reality. To defend this hypothesis I need to explain what a "model" is, and what I mean by "possible" in this context.

A model is any physical system whose behavior correlates in some way with another physical system. An orrery, for example, is a model of the solar system. An orrery is a mechanical model, generally made of gears. It is the actual physical motion of the gears that corresponds in some way to the actual physical motion of the planets.

Mathematics is obviously not made of gears, but remember that mathematics is not the model, it is the study of (possible) models (of objective reality). So the study of mechanical models like orreries falls under the purview of mathematics. Mathematics obviously transcends the study of mechanical models in some way, but you may be surprised at how closely math and mechanism are linked historically. Math began when humans made marks on sticks (or bones) or put pebbles in pots to keep track of how many sheep they had in their flocks or how much grain they had harvested. (These ancient roots of math live on today in the word "calculate" which derives from the Latin word "calculus" which means "pebble".) And mathematics was closely linked to the design and manufacture of mechanical calculating devices, generally made using gears just like orreries, right up to the middle of the 20th century.

There is another kind of model besides a mechanical one: a symbolic model. Mathematics has its roots in arithmetic which has its roots in mechanical models of quantities where there was a one-to-one-correspondence between marks-on-a-stick or pebbles-in-a-pot and the things being counted. But this gets cumbersome very quickly as the numbers get big, and so humans came up with what is quite possibly the single biggest technical innovation of all time: the symbol. A symbol is a physical thing -- usually a mark on a piece of paper or a clay tablet, but also possibly a sound, or nowadays a pattern of electrical impulses in a silicon chip -- that is taken to stand for something to which that mark bears no physical resemblance at all. The familiar numerals 0 1 2 3 ... 9 are all symbols. There is nothing about the shape of the numeral "9" that has anything to do with the number it denotes. It's just an arbitrary convention that 9 means this many things:

@ @ @ @ @ @ @ @ @

and 3 means this many things:

@ @ @

and so on.

Not all symbols have straightforward mappings onto meanings. Letters, for example, are symbols but in general they don't mean anything "out of the box". You have to assemble letters into words before they take on any meaning at all, and then arrange those words into sentences (at the very least) in order to communicate coherent ideas. This, too, is just a convention. It is not necessary to use letters, and not all languages do. Chinese, for example, uses logograms, which are symbols that convey meaning on their own without being composed with other symbols. And symbols don't have to be abstract either. Pictograms are symbols that communicate meaning by their physical resemblance to the ideas they stand for.

Mathematical symbols work more like logograms than letters. A mathematical symbol like "3" or "9" or "+" generally conveys some kind of meaning by itself, but you have to compose multiple symbols to get a complete idea like "3+2=5". Not all compositions of symbols have coherent meanings, just as not all compositions of letters or words have coherent meanings. There are rules governing how to compose mathematical symbols just as for natural language. "3+2=5" is a coherent idea (under the usual set of rules) but "325=+" is not.

There is a further set of rules for how to manipulate mathematical symbols to produce "correct" ideas. An example of this is the rules of arithmetic you were taught in elementary school. The result of manipulating numerals according to these rules is a symbolic model of quantities. There is a correspondence between strings of symbols like "967+381=1348" and the behavior of quantities in objective reality. Moreover, manipulating symbols according to these rules might seem like a chore, but it is a lot easier to figure out what 967+381 is by applying the rules of arithmetic than by counting out groups of pebbles.
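To make the contrast concrete, here is a minimal sketch in Python of the two kinds of model. The function names `add_numerals` and `add_pebbles` are made up for illustration. The first manipulates strings of digit symbols using the column-and-carry rule from elementary school; the second literally piles up pebbles and counts them.

```python
DIGITS = "0123456789"

def add_numerals(a: str, b: str) -> str:
    """Add two decimal numerals by the column-and-carry rule.

    Only the symbols are manipulated: digits are looked up by position,
    and a carry is passed from each column to the next one on the left.
    """
    result = []
    carry = 0
    i, j = len(a) - 1, len(b) - 1
    while i >= 0 or j >= 0 or carry:
        da = DIGITS.index(a[i]) if i >= 0 else 0
        db = DIGITS.index(b[j]) if j >= 0 else 0
        total = da + db + carry          # stands in for the memorized single-digit table
        result.append(DIGITS[total % 10])
        carry = total // 10
        i -= 1
        j -= 1
    return "".join(reversed(result))

def add_pebbles(m: int, n: int) -> int:
    """Pour two piles of pebbles into one pot and count what is there."""
    pot = ["@"] * m + ["@"] * n
    return len(pot)

print(add_numerals("967", "381"))   # 1348
print(add_pebbles(967, 381))        # 1348
```

Both routes agree, but the symbolic one touches four columns of digits while the pebble one has to handle more than a thousand objects.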

It turns out that manipulating symbols according to the right rules yields almost unfathomable power. With the right rules you can produce symbolic models of ... well, just about anything, including, but not limited to, every aspect of objective reality that mankind has studied to date (with the possible exception of human brains -- we will get to that later in the series).

Mathematics is the study of these rules, figuring out which sets of rules produce interesting and useful behavior and which do not. One of the things that makes sets of rules for manipulating symbols interesting and useful is being able to separate strings of symbols into categories like "meaningful" and "meaningless" or "true" and "false". Sometimes, for sets of rules that produce models of objective reality, "true" and "false" map onto things in objective reality, and sometimes they are just arbitrary labels.

The canonical example of this is Euclid's fifth postulate: given a line and a point not on that line, there is exactly one line through the given point parallel to the given line. For over 2000 years humans believed that to be true and were vexed when they couldn't find a way to prove it. It turns out that it is neither true nor false but a completely arbitrary choice; you can simply choose whether the number of lines through a point parallel to a given line is one or zero or infinite. Any of those three choices leads to useful and interesting results. As a bonus, some of them turn out to be good models of some aspects of objective reality too.

Another way of looking at it is that mathematics looks at what happens when you remove the constraints of physical reality from a set of rules that model that reality. More often than not it turns out that when you do this, what you get is a system that is useful for modeling some other aspect of reality. Sometimes that aspect of reality is something that you would not have even suspected to exist had not the math pointed you in that direction.

An example: arithmetic began as a set of rules for counting physical objects. You cannot have fewer than zero physical objects. But you can change the rules of arithmetic to behave as if you could have fewer than zero objects by introducing "the number that is one less than zero" a.k.a. negative one. Even though that concept is patently absurd from the point of view of counting apples or sheep, it turns out to be indispensable when counting electrical charges or keeping track of financial obligations. So is it "true" that (say) three minus five equals negative two? It depends on what you're counting. Is it "true" that there are an infinite number of primes? It depends on your willingness to suspend disbelief and imagine an infinite number of numbers even though most of those could not possibly designate any meaningful quantity of physical objects in our finite universe. Is the Banach-Tarski paradox "true"? It depends on whether or not you want to accept the Axiom of Choice. (And if you think the AoC seems "obviously true" then you should read this.)

There are many examples of alternatives to the usual rules of numbers that turn out to be useful. The most common example is modular arithmetic, which produces useful models of things like time on a clock, days of the week, and adding angles. Another example is p-adic numbers, which are like modular arithmetic on steroids. It is worth noting that in modular arithmetic, some arithmetic truths that are often taken as gospel turn out not to be true. For example, in arithmetic modulo 7, the square root of two is a rational number (not just rational but an integer!), because 3 squared is 9, and 9 is the same as 2 modulo 7.
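The claim about the square root of two is easy to check by brute force if we read it as a statement about arithmetic modulo 7. Here is a small Python sketch; the helper name `square_roots_mod` is invented for this example.

```python
def square_roots_mod(value: int, modulus: int) -> list:
    """Return every residue x in 0..modulus-1 with x*x congruent to value."""
    return [x for x in range(modulus) if (x * x) % modulus == value % modulus]

# Two has integer square roots modulo 7 ...
print(square_roots_mod(2, 7))   # [3, 4]

# ... but none at all modulo 5, so the answer depends on the rules chosen.
print(square_roots_mod(2, 5))   # []

# The same kind of arithmetic models a clock: 5 hours after 10 o'clock.
print((10 + 5) % 12)            # 3
```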

Philosophers and religious apologists often cite mathematical "truths" as somehow more "pure" than empirical truths and our ability to perceive them to be evidence of the existence of God or some other ethereal realm. Nothing could be further from the (empirical!) truth! In fact, all mathematical "truths" are contingent, dependent on a set of (mostly tacit) assumptions. Even the very concept of truth itself is an assumption!

With that in mind, let us revisit the liar paradox, to which I promised you an answer last time. I'll use the two-sentence version since that avoids technical issues with self-reference:

1. Sentence 2 below is false.

2. Sentence 1 above is true.

The puzzle is how to assign truth values to those two sentences. The reason it's a puzzle is that there are two tacit assumptions that people bring to bear. The first is the Law of the Excluded Middle: propositions are either true or false. They cannot be both, and they cannot be neither. A simple way to resolve the paradox is simply to discharge this assumption and say that propositions can be half-true, and that being half-true is the same as being half-false.

The second tacit assumption that makes the Liar paradox paradoxical is the assumption that the truth values of propositions must be constant, that they cannot change with circumstances. This is particularly odd because everyday life is chock-full of counterexamples. In fact, the vast majority of propositions that show up in everyday life depend on circumstances. "It is raining." "I am hungry." "It is Tuesday." The truth values of those all change with circumstances. Obviously, "It is Tuesday" is only true on Tuesdays. Why cannot the truth values of the Liar paradox do the same thing? We can re-cast it as:

1. At the moment you contemplate the meaning of sentence 2 below, it will be false.

2. At the moment you contemplate the meaning of sentence 1 above, it will be true.

The truth values then flip back and forth between true and false as you shift the focus of your contemplation from one to the other. Note that both of these solutions can also be applied to the "this sentence is true" version, where all three of "true", "false" and "half-true/half-false" produce consistent results (though of course not at the same time).
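The first resolution can be checked mechanically. The sketch below rests on some assumptions of mine: truth values are encoded as the numbers 0, 0.5 and 1, "is false" is modeled as one minus the value, and "is true" as the value itself. It then searches for assignments that satisfy what each sentence says.

```python
from itertools import product

FALSE, HALF, TRUE = 0.0, 0.5, 1.0
VALUES = (FALSE, HALF, TRUE)

def negate(value: float) -> float:
    """Model "X is false" as one minus the truth value of X (an assumption)."""
    return 1.0 - value

def liar_pair_solutions() -> list:
    """Assignments (s1, s2) where s1 says "s2 is false" and s2 says "s1 is true"."""
    return [
        (s1, s2)
        for s1, s2 in product(VALUES, repeat=2)
        if s1 == negate(s2) and s2 == s1
    ]

def truth_teller_solutions() -> list:
    """Values consistent with "this sentence is true", i.e. v equals v."""
    return [v for v in VALUES if v == v]

print(liar_pair_solutions())      # [(0.5, 0.5)]
print(truth_teller_solutions())   # [0.0, 0.5, 1.0]
```

With only "true" and "false" allowed the liar pair has no consistent assignment at all; once "half-true" is admitted there is exactly one.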

Finally, note that we can also attack the Liar paradox experimentally by building a physical model of it. There are many ways to do this, but any physical mechanism that emulates digital logic will do. You could build it out of transistors or relays or Legos. All you need to do is build an inverter, a device whose output is the opposite of its input. Then you connect the output to the input and see what happens.

In the case of a relay, there is enough mechanical delay that the result will be flipping back and forth. It will happen fast enough that the result will sound like a buzzer, and indeed back in the days before cheap transistors this is often how actual buzzers were made. If you build this circuit out of transistors then the outcome will depend a lot on the details, and you will end up with either an oscillator or a voltage that is half-way between 1 and 0.

If you put two inverters in series and connect the final output to the initial input you will have built a latch, which will stay at whatever condition it starts out in. This is how certain kinds of computer memory are made.
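If you don't have relays handy, the two circuits can be imitated in software. This is only a toy model resting on my own simplifying assumption that the signal takes one discrete time step to travel around the loop, which is roughly the relay case; it says nothing about what real transistors would do. The function name `feedback_loop` is made up.

```python
def feedback_loop(inverters: int, start: bool, ticks: int) -> list:
    """Trace the signal in a ring of inverters, one trip around the loop per tick."""
    signal = start
    trace = [signal]
    for _tick in range(ticks):
        for _gate in range(inverters):
            signal = not signal      # each inverter outputs the opposite of its input
        trace.append(signal)
    return trace

# One inverter fed back on itself: the buzzer.
print(feedback_loop(1, False, 5))   # [False, True, False, True, False, True]

# Two inverters in series: a latch that keeps whatever it started with.
print(feedback_loop(2, True, 3))    # [True, True, True, True]
print(feedback_loop(2, False, 3))   # [False, False, False, False]
```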

The modeling train runs in both directions. This will become important later when we talk about information. But that will have to wait until next time.

Saturday, May 18, 2024

A Scientific Theory of Truth

(This is part 7 of a series about the scientific method.)

The over-arching theme of this series is that science can serve as a complete worldview, that it can answer deep philosophical and existential questions normally associated with philosophy or religion. I gave a small example of that in the last installment where I showed how the scientific method can be deployed to answer a fun philosophical riddle. Here I want to show how it can tackle a much deeper question: what is truth?

Note that what I mean by "truth" here in this chapter -- and only this chapter -- is not scientific truth, but philosophical truth. Remember, I have already disclaimed the idea that science finds philosophical or metaphysical truth. It doesn't, it finds good explanations that account for observations, which are valuable because one of the properties of a good explanation is that it has predictive power. But a good explanation is not necessarily true in the philosophical sense. Newton's theory of gravity, for example, turns out to be completely wrong from a philosophical point of view, but it is still useful because it makes accurate predictions nonetheless.

The word "truth" is sometimes used in science as a shorthand for "theory that is sufficiently well established and makes sufficiently accurate predictions under a sufficiently broad range of circumstances that we proceed as if it were the (philosophical) truth even though we know it's not." When I want to emphasize that I am referring to philosophical or metaphysical truth I will capitalize it. Science seeks (and finds!) lower-case-t truth, not upper-case-T Truth.

But this does not mean that (upper-case-T) Truth is beyond the realm of scientific inquiry! Remember, the scientific method is to find the best explanation that accounts for all of your observations, and one of your observations (if you are a normal human) is a constant stream of overwhelming evidence that there is lower-case-t truth out there, and the obvious explanation for that is that there is upper-case-T Truth out there, and the former is somehow a faithful reflection of the latter. Furthermore, what we know about lower-case-t truth can put constraints on upper-case-T Truth. Science might not be able to tell us what the Truth is, but it can tell us with very high confidence what it is not.

So how do we use the scientific method to tackle a philosophical problem? We can't do it directly because "What is Truth?" is not a properly framed scientific question. Scientific inquiry must begin with a Problem, something we observe that cannot be explained by current theories. "What is Truth?" is not a valid Problem statement because Truth is not something we observe. To start a scientific inquiry we have to somehow transform this into a question about something we can observe.

Fortunately, there is a general way of doing this for philosophical questions: we can observe that people wonder about what Truth is! We can further observe that people have some intuitions about what Truth is (or at least what truth is), and that some of these intuitions are common across a wide swath of humanity, to the point where someone who does not share these intuitions can be considered mentally ill. For example, here are some things that are widely regarded as true:

All triangles have three sides.

The sun rises in the east.

The sky is blue.

Humans are mortal.

And here are a few examples of statements that are widely regarded to be not true, a.k.a. false:

Some triangles have four sides.

The sun rises in the west.

The sky is green.

Humans are immortal.

We can now advance some hypotheses to explain these observations:

There is a metaphysical Truth out there, and our intuitions are somehow in contact with this Truth.
Our intuitions about truth are illusions. There are no actual truths. What we call "truth" is nothing more than a social construct into which we are indoctrinated. (If you think this sounds ridiculous, believe me, I sympathize. But this is actually a hypothesis that is taken seriously in some academic circles. It's called post-modernism.)

Before we go on to discuss the relative merits of these hypotheses (though I guess I've already tipped my hand here), let's consider some more examples:

Richard Nixon had eggs for breakfast on the morning of January 1, 1962.
Coffee tastes good.
The United States is a Christian nation.
Gandalf was a wizard.
Love is a many-splendored thing.
Die Erde ist Rund.
This sentence is false.
Given a line and a point not on that line, there is exactly one line passing through the given point parallel to the given line.

Each of these examples is meant to illustrate a different subtlety with the notion of "truth". The first one is either true or false, i.e. there is an actual fact of the matter regarding whether or not this statement is true, but that fact is almost certainly beyond our reach. We can know that this statement is either true or it is false, but almost certainly we can't know which.

The second example is completely different. It is an example of a subjective claim. There is no "actual fact of the matter" with regards to the taste of coffee. Some people like it, some don't. Note that you can transform a subjective claim into an objective one by binding it to a particular person: "Coffee tastes good" is subjective, but "I think coffee tastes good" or "coffee tastes good to me" is objective.

The third example is harder to characterize. At first glance it might appear to be a subjective claim, a matter of opinion analogous to the flavor of coffee. But consider "Israel is a Jewish state", or "Saudi Arabia is a Muslim state." Surely those are objectively true? Surely there is more truth to them than "Israel is a Muslim state" or "Saudi Arabia is a Jewish state"? I won't say anything more about this example right now, but keep it in the back of your mind because it will become important later in this series.

The fourth example is actually controversial, at least among philosophers. If you polled ordinary people on the street you would probably find pretty overwhelming agreement that this statement is true, especially when contrasted with, say, "Gandalf was an orc." But some philosophers argue that any statement about Gandalf cannot be true because Gandalf doesn't actually exist, and non-existent things can't have properties. So it cannot be true that Gandalf was a wizard, but neither is it true that he was not a wizard. He simply wasn't anything. Personally, I think this is ridiculous, and I would not even mention it but for the fact that this point of view was advanced in all seriousness by someone whose views I otherwise hold in the highest regard.

To me, the answer is pretty obvious: the sentence "Gandalf was a wizard" does not mean that Gandalf was an actual wizard in actual physical reality. It means that within the context of the fictional world created by J.R.R. Tolkien, Gandalf was a wizard. Or, if you really want to be strict about it, "J.R.R. Tolkien wrote that Gandalf was a wizard", which is clearly true.

The fifth example appears superficially to be a factual claim, but it isn't. It's poetry. I included it to show that there is something about factual claims that transcends mere syntax.

The sixth example is German for "the earth is round." This example makes a similar point to the one before: the process of deciding whether or not a statement is true, or even of deciding whether or not it even makes sense to say that it is true or false, is not a simple one. There is no straightforward procedure that you can apply to a string of letters to decide these things.

The seventh example is the famous Liar Paradox. Superficially it appears that this sentence should be either true or false, but either possibility leads immediately to a contradiction. Another interesting example, which is not considered nearly as often but which I think is equally interesting, is the opposite: "This sentence is true." It can be true, or it can be false, and both possibilities are internally consistent.

This is also the case for the last example, which is the famous Euclid's fifth postulate. Intuitively it seems like it should be true, and it also seems like it should be provable from some simpler assumptions, but humans searched in vain for such a proof for two thousand years before realizing that whether or not to consider this statement true or false is an arbitrary choice, and either choice leads to interesting and useful results.

The main point I'm trying to make here is that truth is complicated. The road to truth winds through the vagaries of natural language and subjective experience, takes a few twists and turns through prejudice and personal opinion, before finally arriving...well, somewhere. My personal experience is consistent, at least superficially, with the hypothesis that there is an objective reality "out there" which I share with other conscious beings. Specifically, I am something called a "human being" living on the surface of something called a "planet" which I share with other human beings who do things like eat and sleep and build computers and write blog posts. If you don't accept that, then I'd be interested to know how you account for what you are doing right this very moment as you read this.

For now, though, I am just going to assume that this is the case. This is an example of engaging in Step 2 of the scientific method. The first step is to identify a Problem. In this case, the Problem is to account for the fact that humans profess to believe, sometimes vehemently, that certain things are true and other things are false. It's possible that this is because all humans other than me are NPCs in a simulation, and I'm the only truly conscious being in the universe. (This is called solipsism.) It's hard (though not impossible) to refute solipsism, but here I'm not even going to try. I'm just going to assume it's false, and that all my fellow humans really are what they appear to be.

So here is the actual hypothesis I am willing to defend, my proposed answer to the question of "What is truth?"

Ron's Theory of Truth: Truth is a property of propositions, which are ideas that stand in relation to some circumstance in objective reality (whose existence we have assumed for the sake of argument). If the circumstance corresponding to that proposition actually pertains, then the proposition is true, otherwise it is false.

Notice that I have introduced two new words here: "proposition" and "idea". I don't want to get into too much detail about these just now. My purpose here is not to present an academically rigorous argument, merely to illustrate in broad brushstrokes how the scientific method can be used to attack problems that are often regarded as outside of the scope of scientific inquiry. I will get much more precise about this later in this series, but for now just assume that "idea" means what it is commonly taken to mean: some vaguely identifiable thing that exists in someone's mind which can be somehow transferred into another person's mind. This transferability is the defining characteristic of an idea; it is what distinguishes ideas from other things that might exist in someone's mind, like emotions or self-awareness.

Despite the fact that the idea of an idea (!) is pretty common, it is actually very hard to demonstrate. I can't show you an actual idea. All I can show you is a rendering or representation of an idea through some physical medium like writing or speech or dance or music. So, for example, I can write:

The earth is round.

That looks like an idea, but appearances are deceiving. What you are looking at is not something inside my brain (which is where ideas live) but patterns of light emitted from your computer screen, which, needless to say, is not part of my brain. The marks you see on the screen get translated by your brain into an idea, but the marks and the idea are not the same thing. The idea is the thing that ends up in your brain after seeing the marks, which in this case your brain interprets as letters and words. Compare with:

Die Erde ist Rund.

地球は丸い

Those are completely different marks on the page, but they all denote the same idea, namely, that the earth is round. That idea is a proposition because it maps onto things in objective reality, namely a planet called (in English) "earth" and a physical property called (again, in English) "round". And that proposition is true because that thing actually has that property.

Note that evaluating the truth of statements on my theory is a two-step process. The first step is mapping a rendering or a representation of an idea, which here will almost always be marks on a computer screen, onto an actual idea, and the second step is mapping that idea onto reality. The importance of the first step cannot be overstated. It is capable of cutting huge swathes through the philosophical underbrush. Many seemingly intractable philosophical problems fall before it.

Take the example of "Gandalf was a wizard." That string can reasonably be interpreted in two different ways, one of which is true, and the other false. It can be taken to mean, "Gandalf was a real person in the real world, and he was a wizard" or it can be taken to mean, "Gandalf, the fictional character in J.R.R. Tolkien's 'Lord of the Rings' was, within the context of that fictional world, a wizard." Disagreements over the truth of "Gandalf was a wizard" are nothing more than quibbles over this particular ambiguity of the English language.

The example of "all triangles have three sides" can be resolved similarly. The word "triangle" means "a shape with three sides" and so this statement really means "All shapes with three sides have three sides." Which is pretty obviously true, and also not very interesting.

A more interesting example is "The United States is a Christian Nation." This turns on what is meant by the ambiguous phrase, "Christian nation." It might mean that the United States is a Christian theocracy, which it is not (at least not yet). It might mean that the United States was intended to be a Christian theocracy, which it also was not. Or it might mean that the majority of the people living in the United States self-identify as Christian, which is true. Again, disagreements over this are disagreements over the meanings of words, not good-faith disagreements over actual facts.

Good-faith disagreements over actual facts are very rare in science. This is one of the reasons that scientific disagreements are almost invariably settled without resorting to violence, which is in very stark contrast to other methods that humans have tried.

As an exercise, see how far you can get following this line of thought to attack the Liar Paradox, i.e. "This statement is false." I'll give the answer to that in a future installment because this one is getting too long. But as a hint, here are two incorrect answers.

Most people upon seeing this puzzle for the first time think that the resolution has something to do with the self-referential nature of the phrase "this statement". That's not the case. It's straightforward to construct a similar paradox without any self-reference. Here is one way:

The following sentence is false.
The preceding sentence is true.

We can even stop playing fast-and-loose with the distinction between strings and propositions simply by giving names to strings:

S1: "The proposition denoted by string S2 is false."
S2: "The proposition denoted by string S1 is true."

And we can even do the same thing without labels by using a clever trick called Quining, which consists of filling in the details of a string that looks something like, "The string that you get when you follow this procedure ... is false", where the ellipses are filled in with instructions such that the string that you get when you follow those instructions is the exact string that you started with. It's quite a neat trick, and it's the foundation of Gödel's famous incompleteness theorem, wherein he constructs a string that essentially says, "The proposition denoted by this string cannot be proven by standard mathematics."
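The same trick works in programming languages, where a program that prints its own text is called a quine. Here is one possible Python sketch, not the construction Gödel used: the "procedure" is to substitute a template's own quoted form into itself. The names `template`, `program` and `run_and_capture` are mine.

```python
import contextlib
import io

# A template with one hole; the hole is filled with the template's own quoted form.
template = 'template = {!r}\nprint(template.format(template))'
program = template.format(template)

def run_and_capture(source: str) -> str:
    """Run a string of Python source and return whatever it prints."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(source, {})
    return buffer.getvalue()

print(program)
# Running the two-line program reproduces its own text (plus a trailing newline).
print(run_and_capture(program) == program + "\n")   # True
```

Nothing in `program` refers to itself by name or by a word like "this"; the self-reference comes entirely from following the instructions.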

So self-reference is not the problem.

A second possibility is that the liar string does not denote a proposition. Just because a sentence bears a superficial resemblance to a proposition doesn't mean it actually is one. "Love is a many-splendored thing" bears a superficial resemblance to "love is an emotion", but the latter denotes a proposition while the former does not. In order to be a proposition on my theory, an idea has to refer to objective reality somehow because truth is determined by correspondence to reality. The words "love" and "emotion" refer to things in reality, but "many-splendored thing" does not; it's just a poetic rhetorical flourish.

No such problems are immediately evident in the Liar Paradox sentence. It refers to an idea, and ideas are part of reality, and so we cannot reject it as a proposition on the grounds that it does not refer to reality.

I'll give you one final hint: the resolution of the liar paradox involves discharging a tacit assumption about propositions which turns out not to be true according to the definition of truth I've advanced here. If you think you know the answer, put it in the comments. (Note that I have comment moderation turned on. This blog has been around for twenty years and it attracts a lot of spambots.)

In closing, I want to reiterate that the main point here is not to resolve the liar paradox (that's just a fun puzzle that happened to come up) nor any other hard philosophical problem, but merely to show how the scientific method can be applied to such problems. Philosophers have been puzzling over what truth is for millennia; I can't provide an academically rigorous answer in 2500 words. The best I can hope for is to show that these questions are not beyond the scope of scientific inquiry.

But stay tuned. There's more to come.

Sunday, May 05, 2024

Languages are theories: debunking the new riddle of induction and flat-eartherism

This is the sixth installment in a series about the scientific method. My central thesis is that science is not just for scientists, it can be used by anyone in just about any situation.

In part 2 of this series I gave a few examples of how the scientific method can be applied in everyday situations. In this chapter I want to show how it can be used to tackle what is considered to be a philosophical problem, something called the New Riddle of Induction. I already covered the "old" riddle of induction in an earlier chapter but I'm going to go back over it here in a bit more detail.

The "old" problem of induction is this: we are finite beings. There are only a finite number of us humans. Each of us only lives for a finite amount of time, during which we can only have a finite number of experiences and collect a finite amount of data. How can we be sure that the theories we construct to explain that data don't have a counter-example that we just haven't come across yet?

The reason this is called the "problem of induction" is that the example most commonly used to motivate it is the (alleged) "fact" that all crows are black. It turns out that this isn't true. There are non-black crows, but they are rare. If all the crows you have ever seen are black, then it seems not entirely unreasonable for you to draw the conclusion that all crows are black because you have never seen a counter-example. But of course you would be wrong.

The "new" riddle of induction (NRI) was invented by Nelson Goodman in 1955. It adds a layer to this puzzle by defining a new word, "grue", as follows:

An object is grue if and only if it is observed before midnight, December 31, 2199 and is green, or else is not so observed and is blue.

Goodman then goes on to observe that every time we see a green emerald before December 31, 2199, that is support for the hypothesis that all emeralds are green, but it is equally good support for the hypothesis that all emeralds are grue. So we are as justified in predicting that, at the stroke of midnight on New Year's Eve 2199, all of the world's emeralds will suddenly turn blue as we are in predicting that they will remain green.

Now, of course this is silly. But why is it silly? You can't just say that the definition of "grue" is obviously silly, because we can give an equally silly definition of the word "green". First we define "bleen" as a parallel to "blue":

An object is bleen if and only if it is observed before midnight, December 31, 2199 and is blue, or else is not so observed and is green.

And now it is "green" and "blue" that end up with the silly-seeming definitions:

An object is green if and only if it is observed before midnight, December 31, 2199 and is grue, or else is not so observed and is bleen.

An object is blue if and only if it is observed before midnight, December 31, 2199 and is bleen, or else is not so observed and is grue.

The situations appear to be completely symmetric. So on what principled grounds can we say that "grue" and "bleen" are silly, but "blue" and "green" are not?
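The apparent symmetry can be spelled out in code. In this sketch an object is reduced to two facts, both modeling assumptions of mine: its ordinary color (just "green" or "blue") and whether it was observed before the cutoff. The function names are invented. The last two functions define green and blue purely in terms of grue and bleen, mirroring the four definitions above.

```python
def is_grue(color: str, before_cutoff: bool) -> bool:
    """Observed before the cutoff and green, or else not so observed and blue."""
    return (before_cutoff and color == "green") or (not before_cutoff and color == "blue")

def is_bleen(color: str, before_cutoff: bool) -> bool:
    """Observed before the cutoff and blue, or else not so observed and green."""
    return (before_cutoff and color == "blue") or (not before_cutoff and color == "green")

def is_green_via_grue(color: str, before_cutoff: bool) -> bool:
    """Observed before the cutoff and grue, or else not so observed and bleen."""
    return (before_cutoff and is_grue(color, before_cutoff)) or (
        not before_cutoff and is_bleen(color, before_cutoff)
    )

def is_blue_via_grue(color: str, before_cutoff: bool) -> bool:
    """Observed before the cutoff and bleen, or else not so observed and grue."""
    return (before_cutoff and is_bleen(color, before_cutoff)) or (
        not before_cutoff and is_grue(color, before_cutoff)
    )

# The round trip recovers the ordinary colors in every case.
for color in ("green", "blue"):
    for before_cutoff in (True, False):
        assert is_green_via_grue(color, before_cutoff) == (color == "green")
        assert is_blue_via_grue(color, before_cutoff) == (color == "blue")
print("green and blue are recoverable from grue and bleen")
```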

You might want to take a moment to see if you can solve this riddle. Despite the fact that philosophers have been arguing about it for decades, it's actually not that hard.

It is tempting to say that we can reject the grue hypothesis because it has this arbitrary time, midnight, December 31, 2199, baked into the definition of the words "grue" and "bleen", so we can reject it for the same reason we rejected last-thursdayism. The grue hypothesis (one might argue) is not one hypothesis, it is just one of a vast family of hypotheses, one for every instant of time in the future. In fact, if you look up the NRI you will find the definition of grue given not in terms of any particular time, but explicitly in terms of some arbitrary time called T.

This explanation is on the right track, but it's not quite right because, as I pointed out earlier, the green hypothesis can also be stated in terms of some arbitrary time T. What is it about "green" that makes it more defensible as a non-silly descriptor than "grue"?

Again, see if you can answer this yourself before reading on.

The answer is that while it is possible to give a silly definition of "green" in terms of grue and bleen, it isn't necessary. It is possible to give a non-silly definition of "green"; it is not possible to give a non-silly definition of grue. It is possible to define "green" without referring to an arbitrary time; it is not possible to define grue without referring to an arbitrary time.

How can we know this? Because the grue hypothesis makes a specific prediction that the green hypothesis does not, namely, that all of the emeralds discovered after time T will be blue, which is to say, they will be a different color than all of the emeralds discovered before T.

Goodman would probably reply: no, that's not true, all of the emeralds discovered before and after time T will be the same color, namely, grue. But this is just word-play. If you take two emeralds, one discovered before T and one after, they will look different. If you point a spectrometer at a before-T emerald and an after-T emerald, the readings will be different. In other words, on the grue hypothesis you will be able to distinguish experimentally between emeralds discovered before T and after T. The grue hypothesis is falsifiable, and it will almost certainly be falsified the first time someone discovers an emerald after time T.
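To make the spectrometer point concrete, here is a small sketch of what each hypothesis predicts an instrument would report. The hypothesis strings, the function names, and the idea of reducing a spectrometer reading to a single color word are all simplifying assumptions made for illustration.

```python
# Illustrative sketch only: a "reading" is reduced to a color word, and the
# hypothesis labels and function names are hypothetical.

def predicted_reading(hypothesis: str, discovered_before_t: bool) -> str:
    """Return the color a spectrometer should report under each hypothesis."""
    if hypothesis == "all emeralds are green":
        # No reference to T is needed to state this prediction.
        return "green"
    if hypothesis == "all emeralds are grue":
        # Unpacking "grue" brings the arbitrary time T back into view.
        return "green" if discovered_before_t else "blue"
    raise ValueError(f"unknown hypothesis: {hypothesis}")


def distinguishes_before_from_after(hypothesis: str) -> bool:
    """True if the hypothesis predicts different readings before and after T."""
    return predicted_reading(hypothesis, True) != predicted_reading(hypothesis, False)
```

Under these assumptions only the grue hypothesis predicts an experimentally detectable difference at T, no matter which vocabulary is used to state it.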

The crucial thing here is that your choice of terminology is not neutral; it is an essential component of the expression of your hypothesis. To quote David Deutsch, in an aphorism that arguably sets the record for packing the greatest amount of wisdom into the fewest number of words: languages are theories. An argument based on hiding questionable assumptions under a terminological rug can be rejected on that basis alone.

Here is another example: consider, "The sun rises in the east." Most people would consider that to be true. But if you think about it critically, this sentence is laden with hidden assumptions, not least of which is (at least apparently) that the sun rises. It doesn't. The sun just sits there, and the earth orbits around it while rotating on an axis. That makes it appear, to an observer attached to the surface of the earth, that the sun rises and sets even though it actually doesn't. But that doesn't make "the sun rises in the east" false; reading the sentence that literally is just a deliberate misinterpretation of what those words actually mean in practice. "The sun rises in the east" does not mean that the sun literally rises, it means that the sun appears to rise (obviously), and it does so in the same general direction every morning. There is also an implicit assumption that we are making these observations from non-extreme latitudes. At extreme latitudes, the sun does not even appear to rise in the east. In fact, at the poles, the concepts of "east" and "west" don't even make sense -- at the poles, east and west literally do not exist! (By the way, this is not just a trivial detail. This exact same thing will come up again when we start talking about space-time, cosmology, and the origins of the universe.)

Note that "the sun rises in the east" is not an inductive conclusion, nor is it a hypothesis. It is a prediction, one of many, made by the theory that the earth is a sphere rotating about an axis. Furthermore, the fact that the sun rises and sets, together with the fact that this happens at different times in different places, definitively debunks the competing hypothesis that the earth is flat. On a flat earth, if the sun is above the horizon, it must be above the horizon for all observers. If the sun is below the horizon, it must appear below the horizon for all observers, and likewise if the sun is at the horizon. This is in conflict with the observation that sunrise and sunset happen at different times in different locations.

Similarly, "all crows are black" is neither an inductive conclusion nor a hypothesis, but a prediction made by a very complex set of theories having to do with how DNA is transcribed into proteins, some of which absorb light of all visible wavelengths and so appear to be black. "All emeralds are green" works the same way, but with one important distinction worth noting: in the case of crows, the hypothesis admits the possibility of occasional genetic mutations that result in non-black crows, which is in fact exactly what we sometimes observe. (It also predicts that these will be rare, which is also what we observe.)

Emeralds are different. Emeralds are green not because they contain proteins produced by DNA, but because they consist of particular kinds of atoms arranged in a particular crystalline structure with some particular impurities that make them look green. It is possible to have other impurities that produce other colors, but in that case the result is not called an emerald but aquamarine or morganite. All emeralds are green, without exception, because that is a consequence of the definition of the word "emerald" plus the known laws of physics. If a non-green emerald were ever discovered, that is, a mineral with the same chemical composition and crystal structure as an emerald but which was not green, that would be Big News.

Notice how easy all this was. We didn't have to do any math. We didn't have to get deep into the weeds of either scientific or philosophical terminology. The hairiest technical terms I had to use to explain all this were "chemical composition" and "crystalline structure".

Notice too that we didn't have to debunk any of the specific arguments advanced by flat-earthers. All we had to do is think about what "the sun rises in the east" actually means, and combine that with the fact that time zones are a thing, to generate an observation that the flat-earth hypothesis cannot explain. Unless and until flat-earthers refute that (and they won't because they can't) we can confidently reject all of their arguments even if we have not examined them in detail, just as we can confidently reject claims of perpetual motion even if we have not examined those claims in detail.

In fact, we can reject flat-eartherism even more confidently than we can reject perpetual motion, and that is really saying something. There are possible worlds where the second law of thermodynamics doesn't apply; the world we live in just happens not to be one of them. But it is a logical impossibility for the sun to rise and set at different times on a flat earth. Simultaneous sunset and sunrise for all observers is a mathematical consequence of what it means to be flat.
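The geometry behind that claim can be sketched in a few lines. This is only an illustration under stated assumptions: the sun is treated as a point, atmospheric refraction is ignored, the flat earth is the plane z = 0, and the spherical case is restricted to observers on the equator with a very distant sun directly over the equator. The function names and coordinates are made up for the example.

```python
import math

# Illustrative sketch only: point-like sun, no atmospheric refraction,
# hypothetical function names and coordinates.

def sun_visible_flat(observer_xy: tuple, sun_xyz: tuple) -> bool:
    """Flat earth: the ground is the plane z = 0.

    The elevation angle of the sun is atan2(height, horizontal distance),
    which is positive exactly when the sun's height is positive, regardless
    of where the observer stands.
    """
    ox, oy = observer_xy
    sx, sy, sz = sun_xyz
    horizontal = math.hypot(sx - ox, sy - oy)
    return math.atan2(sz, horizontal) > 0


def sun_visible_sphere(longitude_deg: float, subsolar_longitude_deg: float) -> bool:
    """Spherical earth, observer on the equator, distant sun over the equator.

    The sun is above the horizon when the observer is within 90 degrees of
    longitude of the point directly beneath the sun.
    """
    diff = math.radians(longitude_deg - subsolar_longitude_deg)
    return math.cos(diff) > 0


# Three observers scattered across a flat earth all agree about the sun.
flat_observers = [(0.0, 0.0), (5000.0, 0.0), (-8000.0, 3000.0)]
flat_day = {sun_visible_flat(o, (1000.0, 2000.0, 5000.0)) for o in flat_observers}
flat_night = {sun_visible_flat(o, (1000.0, 2000.0, -5000.0)) for o in flat_observers}

# Two observers on opposite sides of a sphere disagree.
sphere_views = {sun_visible_sphere(lon, 0.0) for lon in (0.0, 180.0)}
```

On the plane, moving the observer changes only the horizontal distance and never the sign of the elevation angle, so everyone sees day or everyone sees night. On the sphere the answer depends on longitude, which is what time zones reflect.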

The take-away message here is that the choice of terminology, the concepts you choose to bundle up as the definitions of words, is an integral part of the statement of a hypothesis. Often the entire substance of a hypothesis is contained not in the statement of the hypothesis, but in the definitions of the words used to make the statement.

There are all kinds of problems that philosophers have argued about for decades that are easily resolved (and also bad science pedagogy that is easily recognized) once one comes to this realization. It is a hugely empowering insight. If someone tries to explain something science-y to you and it doesn't make any sense, it very well might just be that they haven't explained what they mean by the words they are using. Science is chock-full of specialized terminology, and a lot of it sounds intimidating because, for historical reasons, scientists have adopted words with Greek and Latin roots (and sometimes German too). These can sound weird, but the important thing to remember is that even weird-sounding words are just words, and they mean things just like more familiar words, and the things that they mean are often not nearly as intimidating as the words themselves. Don't let weird words scare you.

The same can be said for math. A lot of people are put off from science because it tends to have a lot of math, which they find intimidating. But here is the empowering secret about math: math is just another language! It is a very weird and specialized language, but a language nonetheless. It uses a lot of unfamiliar symbols and notational conventions (the relative placement of symbols on a page matters a lot more than in other languages) but at the end of the day it's just marks on a page that mean something, and it's the meaning that matters, not the marks. Keep that in mind any time things start to feel like they're getting too complicated.

[UPDATE] There are actually a lot of adjectives in English that act like "grue" and change their underlying meanings depending on time or other circumstances: normal, average, extraordinary, fashionable, affordable, polite, misspelled, technologically-advanced... There are also some shape-shifting nouns, with the most prominent examples being "here" and "now". What is it that makes these less silly than "grue" and "bleen"? It is very simple: the changes captured by the definitions of these words reflect actual changes that happen in the world while the changes captured by the definitions of "grue" and "bleen" do not.

Monday, April 29, 2024

The Scientific Method part 5: Illusions, Delusions, and Dreams

(This is the fifth in a series on the scientific method.)

Daniel Dennett died last week. He was a shining light of rationality and clarity in a world that is often a dark and murky place. He was also the author of, among many other works, Consciousness Explained, which I think is one of the most important books ever written because it gives a plausible answer to what seems like an intractable question: what is consciousness? And the answer is, to the extent that it is possible to condense a 450-page-long scholarly work down to a single sentence: consciousness is an illusion.

I don't know if Dennett would have agreed with my précis, and I don't expect you to believe it just because I've proclaimed it, or even because Dennett very ably (IMHO) defends it. I might be wrong. Dennett might be wrong. You can read the book and judge for yourself, or you can wait for me to get around to it (and I plan to). But for now I just want to talk about what illusions actually are and why they matter for someone trying to apply the scientific method. In so doing I hope to persuade you only that the hypothesis that consciousness is an illusion might not be as absurd as it seems when you first encounter it. I am not hoping to convince you that it's true here -- that is much too big a task for one blog post -- only that the hypothesis is worthy of consideration and further study.

You are almost certainly reading this on a computer with a screen. I probably don't have to convince you that that screen is a real thing. But why do I not have to convince you of that? Why can I take it for granted that you believe that your computer screen is a solid tangible object that actually exists?

The answer can't be merely that you can see it. In fact, if you are reading this, you probably actually can't see most of your computer screen! What you are seeing instead is an image on your computer screen, and that image is not a real thing. Here, for example, is something that looks like a leopard:

but it is not a leopard, it is a picture of a leopard, and a picture of a leopard is not a leopard. The latter is dangerous, the former not so much. But the point is that, when a computer screen is in use, most of it does not look like a computer screen, it looks like something else. The whole point of a computer screen is to look like other things. Computer screens are the ultimate chameleons.

As I write this, it is still pretty easy to distinguish between real and image-inary leopards (and even imaginary leopards), but that may not be the case much longer. Virtual reality headsets are becoming quite good. I recently had an opportunity to try an Apple Vision Pro and it was a transformative experience. While I was using it, I genuinely thought I was looking through a transparent pane of glass out onto the real world. It was not until later that I realized that this was impossible, and what I was seeing was an image projected onto two very small screens. God only knows where this technology will be in another few decades.

Now take a look at this image, which is called "Rotating Snakes":

If you are like most people, you will see the circles moving. (If you don't see it, try looking at the full-size image.) Since you are almost certainly viewing this on a computer screen, a plausible explanation is that the image actually is moving, i.e. that it is not a static image like the leopard photo but a video or an animated gif, like this:

But that turns out not to be the case. The Rotating Snakes image is static. There are a couple of ways to convince yourself of this. One is to focus on very small parts of the image rather than the whole thing at once, maybe even use a sheet of paper with a hole cut in it to block out most of the image. Another is to print the image on a sheet of paper and look at it there.

The motion you see (again, if you are typical) in Rotating Snakes is an example of an illusion. An illusion is a perception that does not correspond to actual reality. Somehow this image tricks your brain into seeing motion where there is none. The feeling of looking out at the world through a pane of transparent glass in a Vision Pro is also an illusion. And in fact the motion you see in the animated gif above is also an illusion. That image is changing, but it's not actually moving. And even if we put that technicality aside, you probably see a rotating circle, but the white dots that make up that circle are actually all moving in straight lines.

The Rotating Snakes image is far from unique. Designing illusions is an entire field of endeavor in its own right. Illusions exist for all sensory modalities. There are auditory illusions, tactile illusions, even olfactory illusions. The first impression of your senses is not a reliable guide to what is really out there.

There are two other kinds of perceptions besides illusions that don't correspond to reality: dreams and delusions. You are surely already familiar with dreams. They are a universal human experience, and they happen only while you are asleep. But one of the characteristics of dreams is that you are generally unaware that you are asleep while you are dreaming. Dreams can feel real at the time. It is possible to become aware that you are dreaming while you are dreaming. These are called "lucid dreams". They are rare, but not unheard of, and some people claim that you can improve your odds of experiencing one with practice. I've had a few of them in my life, and they can be a lot of fun. For a little while I feel like I am living in a world where real magic exists, and I can do things like fly simply by thinking about it.

But then, of course, I always wake up.

This is the thing that distinguishes dreams from illusions and delusions: dreams only happen when you are asleep. Illusions and delusions happen when you are awake. The difference between illusions and delusions is that delusions, like dreams, are private. They are only experienced by one person at a time, and they are not dependent on any external sensory stimulus.

The word "delusion" is sometimes understood to be pejorative, but it need not be. Delusions are a common part of the human experience. Tinnitus and psychosomatic pain are delusions but the people who suffer from them are not mentally ill or "deluded". Even schizophrenics are not necessarily "deluded" -- many schizophrenics know that (for example) the voices they hear are not real, just as people with tinnitus (I am one of them) know that the high-pitched squeal they experience is not real. What drives them (us?) crazy is that they (we?) can't turn these sounds off.

Delusions don't even have to be unpleasant. They can be induced by psychoactive chemicals like LSD, and (I am told -- I have not tried LSD) those experiences can be quite pleasant, sometimes even euphoric.

Illusions, on the other hand, are only experienced in response to real sensory stimulus, and for the most part in predictable ways that are the same across nearly all humans and even some animal species. Illusions can be shared experiences. Two people looking at the Rotating Snakes illusion will experience the same illusory motion.

So how do we know that illusions are not actually faithful reflections of an underlying reality? After all, the main reason to believe in reality at all is that it's a shared experience. Everyone agrees that they can see cars and trees and other people, and the best explanation for that is that there really are cars and trees and people in point of actual physical (maybe even metaphysical) fact. So why do we not draw the same conclusion when everyone sees movement when looking at Rotating Snakes?

I have already pointed out that you can print the Rotating Snakes image on a sheet of paper and it will still appear to move when you look at it. That is powerful evidence that the motion is an illusion, but it's not proof. How can we be sure that there aren't certain patterns that, when printed on paper, actually move by some unknown mechanism? Maybe the Rotating Snakes image actually causes ink to move around on a sheet of paper somehow. It's not a completely outrageous suggestion. After all, we know that printing very specific patterns on a silicon chip can make it behave in very complicated ways. How can you be sure that paper doesn't have the same ability?

The full argument that Rotating Snakes is an illusion is actually quite complicated when you expand it out in full detail. You have to get deep into the weeds of why silicon can be made to do things that ink and wood pulp can't. But the bottom line is that we have a pretty good idea of how silicon works, and we have a pretty good idea of how paper and ink work, and if it turned out that paper and ink could be made to do anything even remotely like what silicon can do it would be Big News, and since there hasn't been any Big News about this, the best explanation of the perceived motion is that it's an illusion.

Things are very different when it comes to consciousness. Consciousness is also a universal human perception, just like the motion in Rotating Snakes, but the suggestion that it might be an illusion and not an actual reflection of reality is obviously far less of a slam-dunk than the idea that Rotating Snakes is an illusion. In fact, most people when first presented with the idea dismiss it as absurd and unworthy of further consideration. For starters, we have a good understanding of silicon and paper, but we don't have a good understanding (yet) of human brains (Dennett's book notwithstanding). We are nowhere near being able to rule out definitively the possibility that our perception of consciousness is a faithful reflection of some underlying reality that we just don't understand yet.

Another argument against consciousness being an illusion is that it is private. All humans, and possibly some animals, experience it, but each of us only has direct experience of our own consciousness. We cannot directly experience anyone else's.

There is also an argument from first principles that consciousness cannot be an illusion in the same way that Rotating Snakes is: optical illusions are false perceptions, but they are still perceptions. In order to have a perception at all, whether that perception is a faithful reflection of an underlying reality or not, there has to be something real out there to do the perceiving. Consciousness in some sense is that "thing that does the perceiving", or at least it is a perception of the thing-that-does-the-perceiving (whatever that might be) but in any case the idea that consciousness is an illusion is self-defeating: if our perception of consciousness were not a faithful reflection of some underlying reality, we could not perceive it because there would not be any real thing capable of perceiving (the illusion of) consciousness. To quote Joe Provenzano: if consciousness is an illusion, who (or what) is being illused?

I will eventually get around to answering these questions, which will consist mainly of my summarizing Dennett's book, so if you want a sneak preview you can just go read it. Fair warning though: it is a scholarly work, and not particularly easy to follow, so you might just want to wait. But if you're feeling ambitious, or merely curious, by all means go to the source.

In the meantime, rest in peace, Daniel Dennett. Your words had a profound impact on me. I hope that mine may some day do the same for a younger generation.