This post is a second look at The Definability of Truth in Probabilistic Logic, a preprint by Paul Christiano and other Machine Intelligence Research Institute associates, which I first read and took notes on a little over one year ago.
In particular, I explore relationships between Christiano et al’s probabilistic logic and stumbling blocks for self-reference in classical logic, like the liar’s paradox (“This sentence is false”) and in particular Löb’s theorem.
The original motivation for the ideas in this post was an attempt to prove a probabilistic version of Löb’s theorem to analyze the truth-teller sentences (“This sentence is [probably] true”) of probabilistic logic, an idea that came out of some discussions at a MIRIx workshop that I hosted in Seattle.
The Reflection Principle
The reflection principle in Christiano’s paper states that one can add a “probability” symbol ℙ to a first-order theory sufficiently sophisticated to admit Gödel numbering of its sentences, in a way that mostly agrees with some actual probability measure ℙ on sentences. That is, writing ⌜φ⌝ for the Gödel number of φ, for all rationals a, b and every sentence φ (which may include the symbol ℙ), the reflection principle asserts:
- If ℙ(φ) > a, then ℙ(⌜ℙ(⌜φ⌝) > a⌝) = 1.
- If ℙ(φ) < b, then ℙ(⌜ℙ(⌜φ⌝) < b⌝) = 1.
The motivating idea is that ℙ can “see” its own values up to arbitrary (but finite) precision, and in particular can see when the probability of some sentence φ is in a given open set. Christiano et al show that such a ℙ can be constructed (using the high-powered Kakutani Fixed Point Theorem), additionally satisfying the following coherence conditions, which guarantee that ℙ agrees with logical deduction the way we should expect:
- If the background theory T proves φ, then ℙ(φ) = 1.
- If T proves ¬φ, then ℙ(φ) = 0.
- ℙ(φ) = ℙ(φ ∧ ψ) + ℙ(φ ∧ ¬ψ) for all sentences φ and ψ.
The value of the probabilistic reflection principle is that it establishes a meaningful way for a logical system to reason about truth without choking on the liar’s paradox. The reflection is meaningful, because contraposition gives the corresponding schema for closed intervals, guaranteeing that the things which the system believes about itself are actually true (coherence is used here to trade “ℙ(⌜χ⌝) < 1” for “ℙ(⌜¬χ⌝) > 0”):
- If ℙ(⌜ℙ(⌜φ⌝) ≥ a⌝) > 0, then ℙ(φ) ≥ a.
- If ℙ(⌜ℙ(⌜φ⌝) ≤ b⌝) > 0, then ℙ(φ) ≤ b.
Regarding the liar’s paradox, the main example in Christiano et al’s paper is the probabilistic liar sentence G, which says, informally, that “G is probably false”. The reflection principle then goes right ahead and agrees that yes, G is probably false, but then stops short of concluding that this actually makes G true.

More explicitly, for some fixed rational p ∈ (0, 1), G should say that ℙ(⌜G⌝) < p. The reflection principle knows enough to decide that ℙ(G) must be equal to p, and this does contradict G, but only slightly; ℙ simply can’t see its own values with infinite precision, and so does not update further. By contrast, if ℙ tried to assign lower probability than p to G, then ℙ would see that this actually makes G true, and would have to update its estimate for ℙ(G) upwards (actually all the way to 1). The argument that ℙ(G) can’t exceed p is symmetric (a higher value would make G certainly false, forcing the estimate down to 0), and ℙ(G) = p is the only “stable” possibility.
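To make the stability argument concrete, here is a small toy model (my own illustration, not from the paper): treat one round of reflection as an update map on a candidate value x for ℙ(G), where G says ℙ(⌜G⌝) < p. Any x below p forces the estimate to 1, any x above p forces it to 0, and only x = p survives.

```python
from fractions import Fraction

def liar_update(x, p):
    """One round of reflection for the liar sentence G <-> (P(G) < p).

    If P currently assigns G probability x < p, reflection makes P
    certain that 'P(G) < p' holds, i.e. certain of G itself, pushing
    the estimate to 1; symmetrically, x > p pushes it to 0.  Only at
    x == p can P not see which side of the threshold it is on, so
    nothing updates.  (Toy illustration, not from the paper.)
    """
    if x < p:
        return Fraction(1)
    if x > p:
        return Fraction(0)
    return x

p = Fraction(1, 3)
candidates = [Fraction(0), Fraction(1, 4), p, Fraction(1, 2), Fraction(1)]
fixed = [x for x in candidates if liar_update(x, p) == x]
print(fixed)  # [Fraction(1, 3)]
```

Note that the two extremes map to each other (an estimate of 0 jumps to 1 and vice versa), so no value other than p is even eventually stable.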
Löb’s theorem and the classical limits of self-trust
One example not mentioned in the Christiano paper is the truth-teller sentences, i.e. sentences H which say ℙ(⌜H⌝) > p. Application of the reflection principle can quickly rule out probabilities for such H other than 0, p, and 1, but can’t obviously narrow things down any more. Unlike the liar sentences, both extreme values 0 and 1 seem to attract rather than repel the value of ℙ(H), and so it seems possible that some truth-teller sentences could be “true”, some could be “false”, and some could be “in the middle”.
This is unsatisfying, because in classical logic, Löb’s theorem implies that a certain type of truth-teller sentence must actually be true (the Good News). On the other hand, Löb’s theorem also implies that trusting yourself is provably impossible (the Very Bad News).
In this section, I want to talk about Löb’s theorem, so that in the next, I can talk about an attempt to prove an analogue of it for probabilistic logic to resolve the issue of truth-teller sentences. The attempt failed for obvious-in-retrospect reasons, but there are striking parallels that helped me understand both Löb’s theorem and the reflection principle better.
I like to think of Löb’s theorem as a generalization of the liar paradox sometimes called Löb’s paradox, or Curry’s paradox, or the Santa Claus paradox. The informal English version is a sentence S which says “if S is true, then Santa Claus exists”. One then tries to prove S by assuming the hypothesis (that S is true) and working toward the consequent (that Santa Claus exists). Well… if S were true, then S’s hypothesis would be satisfied, and it would follow that Santa Claus exists. So now we’ve established that the consequent follows from the hypothesis, which means we’ve proved S. Thus S is true, and Santa Claus exists.
Of course the argument can be replicated for any proposition you’d like to derive.
Another (possibly clearer) way to phrase the informal sentence is “S is false or Santa Claus exists”. Then S cannot be false for standard liar-paradox reasons, so S must be true, and hence Santa Claus must exist.
To set the stage for the actual Löb’s theorem, let me say that the resolution of the liar’s paradox in classical (non-probabilistic) logic is
- The “truth predicate” for the standard model is simply not definable (Tarski).
- By replacing truth with provability, the modified liar sentence asserting its own unprovability is true in the standard model, but unprovable (Gödel, first incompleteness theorem).
However, self-trust is impossible. This can be seen in Gödel’s second incompleteness theorem, which says that a “reasonable” theory of arithmetic can only prove itself consistent… in the trivial case that it is inconsistent and just proves everything. (Here “reasonable” means that its set of axioms can be enumerated by a computer program.)
The second incompleteness theorem is the special case of Löb’s theorem with φ taken to be ⊥ (standing for an obvious contradiction, like 0 = 1). Using the standard abbreviation □φ for Gödel’s provability predicate applied to ⌜φ⌝:
Theorem 1 (Löb). For a “reasonable” theory T of arithmetic, if T ⊢ □φ → φ, then T ⊢ φ.
This means that the theory can’t trust its own proofs! It can only know that a proof of φ would actually imply φ… in the trivial case that T just proves φ.
This seems really bad, with the small silver lining that truth-teller sentences asserting their own provability must be provable. This fact is actually used in an interesting way in Robust Cooperation in the Prisoner’s Dilemma (my notes here).
One way to prove Löb’s theorem is to formalize the line of reasoning sketched in the Santa Claus paradox (using □ instead of “is true”), which ends up needing to switch out one instance of □φ for φ. Or in other words, the only thing that stops the Santa Claus argument from proving any φ you choose is the inability of T to trust that a proof of φ would actually imply φ.
Löb’s proof relies on three properties of □ known as the Hilbert-Bernays provability conditions:
- (HB1) If T ⊢ φ, then T ⊢ □φ. (If a proof actually exists in the real world, then it’s a finite object that T can see and check.)
- (HB2) T ⊢ □φ → □□φ. (There is an easy recipe that T knows for turning a proof of φ into a proof of □φ. Namely, feed any proof of φ into the proof-checker and watch it accept.)
- (HB3) T ⊢ □(φ → ψ) → □φ → □ψ. (So T knows how to take a proof of φ → ψ and a proof of φ, concatenate them, and apply Modus Ponens to get a proof of ψ.)
The Hilbert-Bernays conditions provide a nice high-level “interface” for reasoning about the □ operator without mucking around too much in the low-level details of its implementation. In the next section, I’ll argue that analogues of Hilbert-Bernays hold in the probabilistic setting for the “has high probability” predicate, yet Löb’s argument fails to go through.
Finally, one point of notation that I’ll insist on is writing implications of the form (φ ∧ ψ) → ρ in “curried” style as φ → ψ → ρ. Here the arrow associates right, i.e. this is an abbreviation for φ → (ψ → ρ).
Proof of Löb’s Theorem. Suppose that T ⊢ □φ → φ, and consider a sentence ψ for which T proves ψ ↔ (□ψ → φ). Then T can prove φ as follows.

1. □ψ → □(□ψ → φ) (by definition of ψ and HB1)
2. □ψ → □□ψ → □φ (by 1 and HB3)
3. □ψ → □□ψ (by HB2)
4. □ψ → □φ (by 2 and 3)
5. □ψ → φ (by the hypothesis □φ → φ and 4)
6. ψ (by definition of ψ and 5)
7. □ψ (by HB1 and 6)
8. φ (by definition of ψ, 6, and 7)
Note the similarity of lines 5 and 6 to the informal argument that “if this sentence is true, then Santa Claus exists” is true.
Finally, because it can be a bit confusing to keep track of where everything is happening (there are levels and meta-levels) and what is allowed, I’ll include one of my favorite exercises in provability logic which stresses these distinctions. Determine which of the following sentences are true (in the standard model of arithmetic) for all sentences φ, and which are provable (in Peano Arithmetic) for all sentences φ. (The □ operator should also be taken relative to PA.)

1. φ → □φ
2. □φ → φ
3. □φ → □□φ
4. □□φ → □φ
Solutions:

1. φ → □φ is clearly false and hence unprovable, as the consistency of PA (namely ¬□⊥) is true but not provable by the incompleteness theorem.
2. □φ → φ is true (PA only proves true theorems), but not generally provable, thanks to Löb’s theorem.
3. □φ → □□φ is provable (and true). It’s the 2nd Hilbert-Bernays condition, which is admittedly the trickiest one to justify without making an appeal to authority.
4. □□φ → □φ is a special case of 2, so is true, but still not provable (take the φ of Löb’s theorem to be □φ and apply the theorem: if □□φ → □φ were always provable, so would be □φ for every φ, which is absurd for φ = ⊥).
Probabilistic analogues
If the truth-teller sentences of classical logic are those ψ with T ⊢ ψ ↔ □ψ, and the truth-teller sentences of probabilistic logic (that we’re interested in, anyway) are those H with H ↔ ℙ(⌜H⌝) > p, then we might want to adapt the □ notation to the new context to talk about “has high probability”.

I started using the notation ⊨ □φ to mean that ℙ(φ) > p. I’m using the semantic “models” symbol ⊨ because the right hand side is something that’s actually true about ℙ in the real world. Similar notation can be established for other comparators ≥, <, and so forth. Boolean combinations of such expressions make sense as well, e.g. ⊨ ¬□φ ∨ □ψ means that either ℙ(φ) ≤ p or ℙ(ψ) > p. Notation that does not make sense is □φ with no ⊨ in front.
Finally, it’s also worth overloading this notation for the formal logical symbol ℙ. Thus the truth-teller sentences H we are most interested in are those with T ⊢ H ↔ □H.

Then boxes can be nested and the resulting sentences can be satisfied by ℙ, e.g. ⊨ □□φ would mean ℙ(⌜ℙ(⌜φ⌝) > p⌝) > p. So outermost boxes here stand for “ℙ(·) > p” in the real world, and all inner boxes (which may be nested arbitrarily deeply) stand for the logical symbol ℙ, as in the formal sentence ℙ(⌜·⌝) > p.

We will be careful to always prefix judgments including this new notation with ⊨ or T ⊢ to disambiguate the overloading.
With this notation established, the “high probability” versions of the reflection principle (and its contrapositive) can be compactly expressed as:

- If ⊨ □φ, then ⊨ □□φ (in fact with ℙ(⌜□φ⌝) = 1).
- If ℙ(⌜ℙ(⌜φ⌝) ≥ a⌝) > 0, then ℙ(φ) ≥ a.
And now that we’ve gotten ourselves organized, analogues of the three Hilbert-Bernays conditions for “has high probability” emerge:

- (HB1) If T ⊢ φ, then ⊨ □φ (this is one of the coherence conditions: ℙ(φ) = 1 > p).
- (HB2) If ⊨ □φ, then ⊨ □□φ (this is reflection).
- (HB3) If ⊨ □(φ → ψ) and ⊨ □φ, then ℙ(ψ) > 2p − 1 (union bound).

Actually the version of (HB3) that will come up in a Löbian argument is the slightly more complicated

- (HB3′) If ℙ(φ → ψ → ρ) = 1, ⊨ □φ, and ℙ(ψ) = 1, then ⊨ □ρ,

which is also true by a “union bound” type of argument: ℙ(¬ρ) ≤ ℙ(¬(φ → ψ → ρ)) + ℙ(¬φ) + ℙ(¬ψ) = ℙ(¬φ) < 1 − p.
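The “union bound” arithmetic above can be sanity-checked mechanically over small finite models (this is my own illustration of the inequality, not the construction from the paper):

```python
from fractions import Fraction
from itertools import product

# Sanity check of the union bound behind the probabilistic (HB3):
# for any distribution over truth assignments,
#     P(psi) >= P(phi -> psi) + P(phi) - 1,
# so two hypotheses each above p only guarantee a conclusion above
# 2p - 1 -- unless one hypothesis actually has probability 1, in
# which case nothing is lost.
worlds = list(product([False, True], repeat=2))  # truth values (phi, psi)

def prob(weights, pred):
    """Probability of an event under a weighting of the four worlds."""
    return sum(w for w, (phi, psi) in zip(weights, worlds) if pred(phi, psi))

weightings = [
    [Fraction(1, 4)] * 4,
    [Fraction(1, 2), Fraction(1, 6), Fraction(1, 6), Fraction(1, 6)],
    [Fraction(0), Fraction(0), Fraction(1, 3), Fraction(2, 3)],
]
for weights in weightings:
    p_phi = prob(weights, lambda phi, psi: phi)
    p_imp = prob(weights, lambda phi, psi: (not phi) or psi)
    p_psi = prob(weights, lambda phi, psi: psi)
    assert p_psi >= p_imp + p_phi - 1  # the union bound, never violated
print("union bound holds in all test models")
```

The third weighting makes the bound tight (ℙ(φ) = 1, so ℙ(ψ) is exactly ℙ(φ → ψ)), which is the case exploited in the Löbian argument below.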
While the Hilbert-Bernays conditions were enough to prove Löb’s theorem, Christiano’s probabilistic logic also has a strong anti-Löb principle in the contrapositive reflection schema

- If ℙ(⌜ℙ(⌜φ⌝) ≥ a⌝) > 0, then ℙ(φ) ≥ a,

i.e. ℙ can always trust its own judgments, at least about closed conditions like ℙ(⌜φ⌝) ≥ a.

This means that either there’s a mistake in the proof of the consistency of the reflection principle, or a generalization of Löb’s theorem will fail for subtle reasons (else the above type of self-trust schema should let us prove ⊨ □φ for arbitrary φ, just as classical Löb turns self-trust into proofs of everything). Since I’ve checked the proof of consistency, it must be the latter.
So what goes wrong in trying to prove that Santa Claus exists (at least with high probability)? Fix any sentence φ, and let ψ be a sentence with T ⊢ ψ ↔ (□ψ → □φ).

1. By definition of ψ and (HB1), we have ⊨ □(ψ → □ψ → □φ), and indeed ℙ(⌜ψ → □ψ → □φ⌝) = 1 by coherence.
2. Before trying to distribute the outermost □, note by (HB2), we have: if ⊨ □ψ, then ⊨ □□ψ, and indeed ℙ(⌜□ψ⌝) = 1 by reflection.
3. Now by the lines above and the specialized (HB3′) with φ := ψ, ψ := □ψ, and ρ := □φ, we have: if ⊨ □ψ, then ⊨ □□φ.
4. By the self-trust principle for closed conditions (after weakening the open condition ℙ(⌜φ⌝) > p to the closed condition ℙ(⌜φ⌝) ≥ p), if ⊨ □□φ, then ℙ(φ) ≥ p. Altogether: if ⊨ □ψ, then ℙ(φ) ≥ p.
At this point, it looks quite a bit like we’ve derived ⊨ ψ. We’ve shown that its contents hold of ℙ, at least in closed form: either ℙ(ψ) ≤ p or ℙ(φ) ≥ p. So we might even write ⊨ ψ, since ψ is equivalent to a boolean combination of sentences that start with ℙ. But this doesn’t mean we can conclude ⊨ □ψ, or even the weaker condition ℙ(ψ) ≥ p (since what we derived is a closed, rather than open, condition on probabilities, so (HB2) does not apply), or even the much weaker condition that ℙ(ψ) > 0.

In fact, if ℙ(φ) < p, we must have ℙ(ψ) = p, even though it looks like we derived ψ. (And ψ is then actually true, since its hypothesis □ψ fails.)
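This pinned value can be checked with the same kind of toy update map used for the liar sentence above (again my own illustration, not from the paper): when ℙ(⌜φ⌝) < p, reflection makes ℙ certain that □φ fails, so ψ ↔ (□ψ → □φ) behaves exactly like a liar sentence saying ℙ(⌜ψ⌝) ≤ p.

```python
from fractions import Fraction

def lob_update(x, p):
    """One reflection round for psi <-> (box psi -> box phi), in the
    case where P(phi) < p, so reflection gives P(box phi) = 0 and psi
    is equivalent to 'P(psi) <= p'.

    If x > p, reflection makes P certain that 'P(psi) > p', hence
    certain psi is false; if x < p, it makes P certain that
    'P(psi) < p', hence certain psi is true.  (Toy illustration.)
    """
    if x > p:
        return Fraction(0)
    if x < p:
        return Fraction(1)
    return x

p = Fraction(9, 10)
candidates = [Fraction(0), Fraction(1, 2), p, Fraction(1)]
fixed = [x for x in candidates if lob_update(x, p) == x]
print(fixed)  # [Fraction(9, 10)]
```

So even though the (closed version of the) contents of ψ are true of ℙ, the only coherent value for ℙ(ψ) sits exactly at the threshold, just as with the liar sentence.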
This is a bullet that the reflection principle has to bite, but it’s not as bad as it looks. Keep in mind that ψ basically just says “either ψ is [probably] false or Santa Claus [probably] exists”. Just as the reflection principle is okay with saying that “this sentence is [probably] false” is [probably] false, and then not updating further, it has to be okay with proving the contents of this Löb sentence, but not being able to recognize this, because doing so requires infinite precision.
Now one might try to massage the argument a little bit, say with ψ such that T ⊢ ψ ↔ (ℙ(⌜ψ⌝) ≥ p → □φ), or different combinations of strict and non-strict inequalities. Such attempts will fail, because the use of (HB2) in step 2 requires an open condition on the □ψ part, but then continuing after step 4 requires all of ψ to be an open condition, which requires a closed condition on □ψ! (Contrast this with the classical case, where the provability predicate □ψ has no openness or closedness, and (HB2) always applies.)
One can change between closed and open conditions with minimal loss of precision, e.g. ℙ(φ) > p implies ℙ(φ) ≥ p, which in turn implies ℙ(φ) > p − ε for every rational ε > 0, but Löb’s trick seems extremely sensitive to such losses, to the point that it failed to go through above because ℙ didn’t have the infinite precision to see that it had derived exactly what the Löb sentence said.
The failure of the Löb argument to get farther than step 4 above is subtle and confused me for longer than I care to admit, but in the presence of the “anti-Löb” principle of the form “if ℙ(⌜ℙ(⌜φ⌝) ≥ a⌝) > 0, then ℙ(φ) ≥ a” it makes quite a lot of sense. There may yet be other ways to narrow down the possible probabilities of truth-teller sentences in Christiano’s logic, but a fully general Löb’s theorem seems not to be such a way.