### Logical Uncertainty and Logical Induction

#### Posted by Qiaochu Yuan

Quick: what’s the $10^{100}$th digit of $\pi$?

If you’re anything like me, you have some uncertainty about the answer to this question. In fact, your uncertainty probably takes the following form: you assign a subjective probability of about $\frac{1}{10}$ to this digit being any one of the possible values $0, 1, 2, \dots, 9$. This is despite the fact that

- the normality of $\pi$ in base $10$ is a wide open problem, and
- even if it weren’t, nothing random is happening; the $10^{100}$th digit of $\pi$ is a particular digit, not a randomly selected one, and it being a particular value is a mathematical fact which is either true or false.

If you’re bothered by this state of affairs, you could try to resolve it by computing the $10^{100}$th digit of $\pi$, but as far as I know nobody has the computational resources to do this in a reasonable amount of time.
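To get a feel for the computational gap, here is a short Python sketch (my illustration, not anything from the paper) using Gibbons’ unbounded spigot algorithm, which emits decimal digits of $\pi$ one at a time using exact integer arithmetic. Each successive digit costs more work, and pushing this to the $10^{100}$th digit is hopeless:

```python
def pi_digits(n):
    """Return the first n decimal digits of pi (3, 1, 4, 1, 5, ...) using
    Gibbons' unbounded spigot algorithm; exact integer arithmetic only."""
    digits = []
    q, r, t, k, m, x = 1, 0, 1, 1, 3, 3
    while len(digits) < n:
        if 4 * q + r - t < m * t:
            # The next digit is determined; emit it and rescale the state.
            digits.append(m)
            q, r, m = 10 * q, 10 * (r - m * t), (10 * (3 * q + r)) // t - 10 * m
        else:
            # Consume another term of the series to narrow the interval.
            q, r, t, k, m, x = (q * k, (2 * q + r) * x, t * x, k + 1,
                                (q * (7 * k + 2) + r * x) // (t * x), x + 2)
    return digits

print(pi_digits(8))
```

Each iteration only shifts decimal places or folds in one more series term, so the cost of reaching digit $n$ grows quickly with $n$; this is a toy, and even heavily optimized digit-extraction methods are nowhere near $10^{100}$.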

Because of this lack of computational resources, among other things, you and I aren’t **logically omniscient**; we don’t have access to all of the logical consequences of our beliefs. The kind of uncertainty we have about mathematical questions that are too difficult for us to settle one way or another right this moment is **logical uncertainty**, and standard accounts of how to have uncertain beliefs (for example, assign probabilities and update them using Bayes’ theorem) don’t capture it.

Nevertheless, somehow mathematicians manage to have lots of beliefs about how likely mathematical conjectures such as the Riemann hypothesis are to be true, and even about simpler but still difficult mathematical questions such as how likely some very large complicated number $N$ is to be prime (a reasonable guess, before we’ve done any divisibility tests, is about $\frac{1}{\ln N}$ by the prime number theorem). In some contexts we have even more sophisticated guesses like the Cohen-Lenstra heuristics for assigning probabilities to mathematical statements such as “the class number of such-and-such complicated number field has $p$-part equal to so-and-so.”
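As a quick sanity check on the $\frac{1}{\ln N}$ heuristic (an illustration only, not from the paper), here is a Python sketch comparing the empirical density of primes in a window above $N = 10^6$ with $1/\ln N$; the helper `is_prime` is a made-up name using plain trial division:

```python
import math

def is_prime(n):
    """Trial division; fine for the modest window used here."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True

N, window = 10**6, 10_000
# Fraction of integers in [N, N + window) that are prime...
empirical = sum(is_prime(n) for n in range(N, N + window)) / window
# ...versus the prime number theorem's predicted density near N.
heuristic = 1 / math.log(N)
print(f"empirical density {empirical:.4f} vs 1/ln N = {heuristic:.4f}")
```

The two numbers agree to a couple of decimal places, which is exactly the kind of “reasonable guess before any divisibility tests” the heuristic licenses.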

In general, what criteria might we use to judge an assignment of probabilities to mathematical statements as reasonable or unreasonable? Given some criteria, how easy is it to find a way to assign probabilities to mathematical statements that actually satisfies them? These fundamental questions are the subject of the following paper:

Scott Garrabrant, Tsvi Benson-Tilsen, Andrew Critch, Nate Soares, and Jessica Taylor, *Logical Induction*. arXiv:1609.03543.

Loosely speaking, in this paper the authors

- describe a criterion called **logical induction** that an assignment of probabilities to mathematical statements could satisfy,
- show that logical induction implies many other desirable criteria, some of which have previously appeared in the literature, and
- prove that a computable logical inductor (an algorithm producing probability assignments satisfying logical induction) exists.

Logical induction is a weak “no Dutch book” condition; the idea is that a logical inductor makes bets about which statements are true or false, and does so in a way that doesn’t lose it too much money over time.

## A warmup

Before describing logical induction, let me describe a different and more naive criterion you could ask for, but in fact don’t want to ask for because it’s too strong. Let $\varphi \mapsto \mathbb{P}(\varphi)$ be an assignment of probabilities to statements in some first-order language; for example, we might want to assign probabilities to statements in the language of Peano arithmetic (PA), conditioned on the axioms of PA being true (which means having probability $1$). Say that such an assignment $\varphi \mapsto \mathbb{P}(\varphi)$ is **coherent** if

- $\mathbb{P}(\top) = 1$.
- If $\varphi_1$ is equivalent to $\varphi_2$, then $\mathbb{P}(\varphi_1) = \mathbb{P}(\varphi_2)$.
- $\mathbb{P}(\varphi_1) = \mathbb{P}(\varphi_1 \wedge \varphi_2) + \mathbb{P}(\varphi_1 \wedge \neg \varphi_2)$.

These axioms together imply various other natural-looking conditions; for example, setting $\varphi_1 = \top$ in the third axiom, we get that $\mathbb{P}(\varphi_2) + \mathbb{P}(\neg \varphi_2) = 1$. Various other axiomatizations of coherence are possible.
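To make the coherence axioms concrete, here is a toy propositional sketch (my illustration; the names `P`, `conj`, `neg` are made up, not from the paper) in which “models” are truth assignments to two atoms, a sentence is its truth function on models, and $\mathbb{P}$ is a mixture over models, as in the theorem below:

```python
from itertools import product

# Toy setup: four "models" = truth assignments to two atoms p and q,
# with an arbitrary probability measure on them (weights sum to 1).
models = list(product([False, True], repeat=2))
weights = [0.1, 0.2, 0.3, 0.4]

def P(phi):
    """Probability that the sentence phi holds in a random model."""
    return sum(w for m, w in zip(models, weights) if phi(m))

p = lambda m: m[0]
q = lambda m: m[1]
top = lambda m: True
conj = lambda a, b: (lambda m: a(m) and b(m))
neg = lambda a: (lambda m: not a(m))

# First axiom: the tautology gets probability 1.
assert abs(P(top) - 1.0) < 1e-12
# Third axiom: P(phi1) = P(phi1 & phi2) + P(phi1 & ~phi2).
assert abs(P(p) - (P(conj(p, q)) + P(conj(p, neg(q))))) < 1e-12
```

Any choice of nonnegative weights summing to $1$ passes these checks, which is one direction of the representation theorem; the interesting direction is that every coherent assignment arises this way.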

**Theorem:** A probability assignment such that $\mathbb{P}(\varphi) = 1$ for all statements $\varphi$ in a first-order theory $T$ is coherent iff there is a probability measure on models of $T$ such that $\mathbb{P}(\varphi)$ is the probability that $\varphi$ is true in a random model.

This theorem is a logical counterpart of the Riesz-Markov-Kakutani representation theorem relating probability distributions to linear functionals on spaces of functions; I believe it is due to Gaifman.

For example, if $T$ is PA, then the sort of uncertainty that a coherent probability assignment conditioned on PA captures is uncertainty about which of the various first-order models of PA is the “true” natural numbers. However, coherent probability assignments are still logically omniscient: syntactically, every provable statement is assigned probability $1$ because they’re all equivalent to $\top$, and semantically, provable statements are true in every model. In particular, coherence is too strong to capture uncertainty about the digits of $\pi$.

Coherent probability assignments can update over time whenever they learn that some statement is true which they haven’t assigned probability $1$ to; for example, if you start by believing PA and then come to also believe that PA is consistent, then conditioning on that belief will cause your probability distribution over models to exclude models of PA where PA is inconsistent. But this doesn’t capture the kind of updating a non-logically omniscient reasoner like you or me actually does, where our beliefs about mathematics can change solely because we’ve thought a bit longer and proven some statements that we didn’t previously know (for example, about the values of more and more digits of $\pi$).
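Here is a toy numerical sketch of that kind of update (the model names `M1`–`M4` are hypothetical, not the paper’s machinery): conditioning a prior over four “models” on a statement $C$ zeroes out the models where $C$ fails and renormalizes the rest.

```python
# A prior over four hypothetical models, and the set of models
# satisfying some newly-believed statement C.
prior = {"M1": 0.4, "M2": 0.3, "M3": 0.2, "M4": 0.1}
satisfies_C = {"M1", "M3"}

# Bayesian conditioning: keep only the C-models, renormalize their mass.
Z = sum(p for m, p in prior.items() if m in satisfies_C)
posterior = {m: (p / Z if m in satisfies_C else 0.0)
             for m, p in prior.items()}
print(posterior)  # M1 and M3 share all the mass; M2 and M4 are excluded
```

This is ordinary conditioning, and the point of the surrounding paragraph is precisely that it is *not* enough: nothing here lets the probabilities move merely because the reasoner proved something it already implicitly believed.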

## Logical induction

The framework of logical induction is for describing the above kind of updating, based solely on proving more statements. It takes as input a **deductive process** which is slowly producing proofs of statements over time (for example, of theorems in PA), and assigns probabilities to statements that haven’t been proven yet. Remarkably, it’s able to do this in a way that eventually outpaces the deductive process, assigning high probabilities to true statements long before they are proven (see Theorem 4.2.1).

So how does logical induction work? The coherence axioms above can be justified by Dutch book arguments, following Ramsey and de Finetti, which loosely say that a bookie can’t offer a coherent reasoner a bet about mathematical statements which they will take but which is in fact guaranteed to lose them money. But this is much too strong a requirement for a reasoner who is not logically omniscient. The logical induction criterion is a weaker version of this condition; we only require that an efficiently computable bookie can’t make arbitrarily large amounts of money by betting with a logical inductor about mathematical statements unless it’s willing to take on arbitrarily large amounts of risk (see Definition 3.0.1).

This turns out to be a surprisingly useful condition to require, loosely speaking because it corresponds to being able to “notice patterns” in mathematical statements even if we can’t prove anything about them yet. A logical inductor has to be able to notice patterns that could otherwise be used by an efficiently computable bookie to exploit the inductor; for example, a logical inductor eventually assigns probability about $\frac{1}{10}$ to claims that a very large digit of $\pi$ has a particular value, intuitively because otherwise a bookie could continue to bet with the logical inductor about more and more digits of $\pi$, making money each time (see Theorem 4.4.2).
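As a toy illustration of the exploitation argument (my sketch, not the paper’s construction): suppose a reasoner prices the claim “the $n$th decimal of $\pi$ is $0$” at $0.5$ for every $n$, when the true frequency is about $\frac{1}{10}$. A bookie who sells the reasoner those contracts profits on nearly every digit:

```python
# First 30 decimals of pi, hardcoded; none of them happens to be 0.
PI_DECIMALS = "141592653589793238462643383279"

price = 0.5            # the miscalibrated reasoner's probability
bookie_profit = 0.0
for d in PI_DECIMALS:
    payout = 1.0 if d == "0" else 0.0  # contract pays 1 if the digit is 0
    bookie_profit += price - payout    # bookie collects price, pays payout
print(bookie_profit)
```

The bookie’s profit grows linearly with the number of digits, with essentially no risk; prices near $\frac{1}{10}$ would cap its edge, which is the intuition for why a logical inductor must converge to them.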

Logical induction has many other desirable properties, some of which are described in this blog post. One of the more remarkable properties is that because logical inductors are computable, they can reason about themselves, and hence assign probabilities to statements about the probabilities they assign. Despite the possibility of running into self-referential paradoxes, logical inductors eventually have accurate beliefs about their own beliefs (see Theorem 4.11.1).

Overall I’m excited about this circle of ideas and hope that they get more attention from the mathematical community. Speaking very speculatively, it would be great if logical induction shed some light on the role of probability in mathematics more generally; for example, in the use of informal probabilistic arguments for or against difficult conjectures. A recent example is Boklan and Conway’s probabilistic arguments in favor of the conjecture that there are no Fermat primes beyond those currently known.

I’ve made several imprecise claims about the contents of the paper above, so please read it to get the precise claims!

## Re: Logical Uncertainty and Logical Induction

You stopped just before the fun bit! How does their logical inductor work?