This post is a second look at *The Definability of Truth in Probabilistic Logic*, a preprint by Paul Christiano and other Machine Intelligence Research Institute associates, which I first read and took notes on a little over one year ago.

In particular, I explore relationships between Christiano et al’s probabilistic logic and stumbling blocks for self-reference in classical logic, like the liar’s paradox (“This sentence is false”) and in particular Löb’s theorem.

The original motivation for the ideas in this post was an attempt to prove a probabilistic version of Löb’s theorem to analyze the truth-teller sentences (“This sentence is [probably] true”) of probabilistic logic, an idea that came out of some discussions at a MIRIx workshop that I hosted in Seattle.

# The Reflection Principle

The reflection principle in Christiano’s paper states that one can add a “probability” symbol to a first-order theory sufficiently sophisticated to admit Gödel numbering of its sententes in a way that mostly agrees with some *actual* probability measure on sentences. That is, for every rational and every sentence (which may include ), the Reflection principle asserts

- If , then .
- If , then .

The motivating idea is that can “see” its own values up to arbitrary (but finite) precision, and in particular can see when the probability of some sentence is in a given open set. Christiano et al show that such a can be constructed (using the high-powered Kakutani Fixed Point Theorem), additionally satisfying the following *coherence conditions* which guarantee that agrees with logical deduction the way we should expect:

- If the background theory proves , then .
- If proves , then .
- .

**The value of the probabilistic reflection principle is that it establishes a meaningful way for a logical system to reason about truth without choking on the liar’s paradox.**

The reflection is *meaningful*, because contraposition gives the corresponding schema for *closed* intervals guaranteeing that the things which the system believes about itself are *actually true*:

- If , then .
- If , then .

Regarding the liar’s paradox, the main example in Christiano et al’s paper is the probabilistic liar sentence which says, informally, that “ is probably false”. The reflection principle then goes right ahead and agrees that yes, is probably false, but then stops short of concluding that this actually makes true.

More explicitly, for some fixed rational , should say that . The reflection principle knows enough to decide that must be equal , and this *does* contradict , but only *slightly*; simply can’t see its own values with infinite precision, and so does not update further. By contrast, if tried to assign *lower* probability than to , then *would* see that this actually makes true, and would have to update its estimate for *upwards* (actually all the way to ). The argument that can’t exceed is symmetric, and is the only “stable” possibility.

# Löb’s theorem and the classical limits of self-trust

One example not mentioned in the Christiano paper are truth-teller sentences, i.e. which say . Application of the reflection principle can quickly rule out probabilities for such other than , , and , but can’t obviously narrow things down any more. Unlike the liar sentences, both extreme values and seem to attract rather than repel the value of , and so it seems possible that some truth-teller sentences could be “true”, some could be “false”, and some could be “in the middle”.

This is unsatisfying, because in classical logic, Löb’s theorem implies that a certain type of truth-teller sentence must actually be true (the Good News). On the other hand, Löb’s theorem also implies that trusting yourself is provably impossible (the Very Bad News).

In this section, I want to talk about Löb’s theorem, so that in the next, I can talk about an attempt to prove an analogue of it for probabilistic logic to resolve the issue of truth-teller sentences. The attempt failed for obvious-in-retrospect reasons, but there are striking parallels that helped me understand both Löb’s theorem and the reflection principle better.

I like to think of Löb’s theorem as a generalization of the liar paradox sometimes called Löb’s paradox, or Curry’s paradox, or the Santa Claus paradox. The informal English version is a sentence which says “if is true, then Santa Claus exists”. One then tries to prove by assuming the hypothesis (that is true) and working toward the consequent (that Santa Claus exists). Well… if were true, then ‘s hypothesis would be satisfied, and it would follow that Santa Claus exists. So now we’ve established that the consequent follows from the hypothesis, which means we’ve proved . Thus is true, and Santa Claus exists.

Of course the argument can be replicated for any proposition you’d like to derive.

Another (possibly clearer) way to phrase the informal sentence is “ is false or Santa Claus exists”. Then cannot be false for standard liar paradox reasons, so must be true, and hence Santa Claus must exist.

To set the stage for the *actual* Löb’s theorem, let me say that the resolution of the liar’s paradox in classical (non-probabilistic) logic is

- The “truth predicate” for the standard model is simply not definable (Tarski).
- By replacing
*truth*with*provability*, the modified liar sentence asserting its own*unprovability*is*true*in the standard model, but*unprovable*(Gödel, first incompleteness theorem).

However, self-trust is impossible. This can be seen in Gödel’s *second* incompleteness theorem, which says that a “reasonable” theory of arithmetic can only prove itself consistent… in the trivial case that it is *in*consistent and just proves everything. (Here “reasonable” means that its set of axioms can be enumerated by a computer program.)

The second incompleteness theorem is a special the special case of Löb’s theorem with taken to be (standing for an obvious contraction, like ). Using the standard abbreviation for Gödel’s provability predicate:

Theorem 1 (Löb)For a “reasonable” theory of arithmetic, if , then .

This means that the theory can’t trust its own proofs! It can only know that a proof of would actually imply … in the trivial case that just proves .

This seems really bad, with the small silver lining that truth-teller sentences asserting their own provability *must* be provable. This fact was is actually used in an interesting way in *Robust Cooperation in the Prisoner’s Dilemma*, my notes here.

One way to prove Löb’s theorem is to formalize the line of reasoning sketched in the Santa Claus paradox (using instead of “is true”), which ends up needing to switch out one instance of for . Or in other words, the only thing that *stops* the Santa Claus argument from proving any you choose is the inability of to trust that a proof of would actually imply .

Löb’s proof relies on three properties of known as the Hilbert-Bernays provability conditions:

- If , then (If a proof actually exists in the real world, then it’s a finite object that can see and check).
- (There is an easy recipe that knows for turning a proof of into a proof of . Namely, feed any proof of into the proof-checker and watch).
- (So knows how to take a proof of and a proof of , concatenate them, and apply Modus Ponens to get a proof of ).

The Hilbert-Bernays conditions provide a nice high-level “interface” for reasoning about the operator without mucking around too much in the low-level details of its implementation. In the next section, I’ll argue that analogues of Hilbert-Bernays hold in the probabilistic setting for the “has high probability” predicate, yet Löb’s argument fails to go through.

Finally, one point of notation that I’ll insist on is writing implications of the form in “curried” style as . Here the arrow associates right, i.e. this is an abbreviation for .

*Proof of Löb’s Theorem.* Suppose that , and consider a sentence for which proves . Then can prove as follows.

- (by definition of and HB1)
- (by 1 and HB3)
- (by HB2)
- (by 2 and 3)
- (by hypothesis and 4)
- (by definition of and 5)
- (by HB1 and 6)
- (by definition of , 6, and 7)

Note the similarity in lines 5 and 6 to the informal argument that “if this sentence is true, then Santa Claus exists” is true.

Finally, because it can be a bit confusing to keep track of where everything is happening (there are levels and meta-levels) and what is allowed, I’ll include one of my favorite exercises in provability logic which stresses these distinctions. Determine which of the following sentences are true (in the standard model of arithmetic) for all sentences , and which are provable (in Peano Arithmetic) for all sentences . (The operator should also be taken relative to PA.)

Solutions:

- is clearly false and hence unprovable, as the consistency of PA (namely ) is true but not provable by the incompleteness theorem.
- is is true (PA only proves true theorems), but not generally provable thanks to Löb’s theorem.
- is provable (and true). It’s the 2nd Hilbert-Bernays condition, which is admittedly the trickiest one to justify without making an appeal to authority.
- is a special case of 2, so is true, but still not provable (take to be and apply Löb’s theorem).

# Probabilistic analogues

If the truth-teller sentences of classical logic are those with , and the truth-teller sentences of probabilistic logic (that we’re interested in anyway) are those with , then we might want to adapt the notation to the new context to talk about “has high probability”.

I started using the notation

to mean that . I’m using the semantic “models” symbol because the right hand side is something that’s actually true about in the real world. Similar notation can be established for other comparators , , and so forth. Boolean combinations of such expressions make sense as well, e.g.

means that either or . Notation that does not make sense is with no in front.

Finally, it’s also worth overloading this notation for the formal logical symbol . Thus the truth-teller sentences we are most interested in are those with .

Then boxes can be nested and the resulting sentences can be satisfied by , e.g.

would mean . So outermost boxes here stand for in the real world, and all inner boxes (which may be nested arbitrarily deeply) stand for the logical symbol .

We will be careful to always prefix judgments including this new notation with or to disambiguate the overloading.

With this notation established, the “high probability” versions of the reflection principle (and contrapositive) can be compactly expressed as

And now that we’ve gotten ourselves organized, analogues of the three Hilbert-Bernays conditions for “has high probability” emerge:

- If , then (this is one of the coherence conditions)
- (this is reflection)
- (union bound)

Actually the version of (HB3) that will come up in a Löbian argument is the slightly more complicated

which is also true by a “union bound” type of argument.

While the Hilbert-Bernays conditions were enough to prove Löb’s theorem, Christiano’s probabilistic logic also has a strong anti-Löb principle in the contrapositive reflection schema

i.e. *can always* trust its own judgments, at least about closed conditions like .

This means that either there’s a mistake in the proof of the consistency of the reflection principle, or that a generalization of Löb’s theorem will fail for subtle reasons (else the above type of schema should always let us prove ). Since I’ve checked the proof of consistency, it must be the latter.

So what goes wrong in trying to prove that Santa Claus exists (at least with high probability)? Fix any sentence , and let be a sentence with .

- By definition of and (HB1), we have
- Before trying to distribute the outermost , note by (HB2), we have
- Now by the lines above and the specialized (HB3) with , , and , we have
- By the self-trust principle for closed conditions,

At this point, it looks quite a bit like we’ve derived . We’ve shown that its *contents* hold of , so we might even write , since is equivalent to a boolean combination of sentences that start with . But this does’t mean we can conclude , or even the weaker condition (since is a *closed*, rather than *open*, condition on probabilities, so (HB2) does not apply), or even the much weaker condition that .

In fact, if , we *must* have , even though it looks like we derived .

This is a bullet that the reflection principle has to bite, but it’s not as bad as it looks. Keep in mind that basically just says “either is false or Santa Claus exists”. Just as the reflection principle is okay with saying that “this sentence is [probably] false” is [probably] false, and then not updating further, it has to be okay with proving the contents of this Löb sentence, but not being able to recognize this because doing so requires infinite precision.

Now one might try to massage the argument a little bit, say with such that or different combinations of and . Such attempts will fail, because the use of (HB2) in step 2 requires an open condition on the part, but then continuing after step 4 requires all of to be an open condition, which requires a closed condition on ! (Contrast this with the classical case where the provability has no openness or closedness, and (HB2) always applies.)

One can change between closed and open conditions with minimal loss of precision, e.g. imples , which in turn implies , but Löb’s trick seems extremely sensitive to such to such losses, to the point that it failed to go through above because didn’t have the *infinite* precision to see that it had derived exactly what the Löb sentence said.

The failure of the Löb argument to get farther than step 4 above is subtle and confused me for longer that I care to admit, but in the presence of the “anti-Löb” principle of the form it makes quite a lot of sence. There may yet be other ways to narrow down the possible probabilities of truth-teller sentences in Christiano’s logic, but a fully general Löb’s theorem seems not to be such a way.