Sam and Polly and Python

The Sam and Polly Puzzle is a beautiful riddle in which two logicians Sam and Polly must simulate the other’s thought processes several layers deep, deriving information from seemingly unhelpful statements. While many variations have been proposed, I first heard of the puzzle from the excellent list at the XKCD wiki, where it takes the following form:

Sam and Polly are perfectly logical mathematicians. A student walks in and says: “I’m thinking of two numbers `x` and `y`, with `3 <= x <= y <= 97`. I’ll tell their sum to Sam, and their product to Polly.” She does so, then leaves, and the following conversation takes place:

Sam (to Polly): You can’t know what `x` and `y` are.

Polly (to Sam): That was true, but now I know.

Sam (to Polly): Now I know too.

Find `x` and `y`.

If you haven’t solved this riddle or one of its variations before, I absolutely recommend giving it a try before reading on. It took me about three hours to solve, with a whiteboard and the Python programming language.

This post consists of:

• Section 1: A clean Python solution
• Section 2: An investigation of the (somewhat weak) dependence on the bounds 3 and 97
• Section 3: A discussion of unbounded variations of the riddle.

The most common variant on the puzzle (and the original version published by Freudenthal in 1969, available here) uses the following bounds instead: `2 <= x < y` and `x + y <= 100`, leading to a different solution. I’m sticking with the above formulation just because that’s the version that I saw and started thinking about before looking around to see what was known. The solution in the following section can be easily modified to solve Freudenthal’s version.

This riddle also known as the Sum and Product Puzzle, or as the Impossible Puzzle (not to be confused with the Hardest Logic Puzzle Ever, another excellent riddle).

1. Solution

The solution takes the following form.

• Say that Sam approves a sum `s` if Sam can make his first statement (“You can’t know what `x` and `y` are”) upon seeing `s`. This means that for every possible `3 <= x <= y <= 97` summing to `s`, the number `x*y` admits at least one other factorization `a*b` with `3 <= a <= b <= 97`. (In particular, `x` and `y` are not both prime. Since the Goldbach Conjecture is true below 100, `s` must be odd.)
• Say that Polly approves a product `p` if Polly can make her statement about `p`, given that Sam said what he did about the sum. That is, if there’s only one possible decomposition `x*y` of `p` such that Sam approves `x+y`.
• Say that Sam approves a sum `s` again if Sam can make his second statement about `s`. That is, if there’s only one possible decomposition `x+y` of `s` such that Polly approves `x*y`.
• Finally, just generate all possible pairs `3 <= x <= y <= 97`, and filter them by the above three “approval” predicates.

With liberal use of Python’s list comprehensions, the above can be turned into code quite compactly.

The above script prints the singleton list `[(13,16)]` in about 1/10 of a second on my six-year-old laptop.

2. Bounds

After writing the solution in the preceding section, I learned that this version of Sam and Polly was not the only one, and so I wondered the extent to which the bounds 3 and 97 were important to the problem. It turns out the 3 is somewhat important, but the 97 really is not!

We’ll say that the Sam and Polly puzzle with bounds `(L,H)` for `L <= H` is the usual Sam and Polly puzzle with `L` replacing `3` and `H` replacing `97` in the obvious way. Such an instance of the Sam and Polly puzzle may fail to have a unique solution (i.e. a pair `(x,y)` such that Sam and Polly can truthfully say what they do about the sum and product of `x` and `y`), either by having many solutions or none at all.

Straightforward modifications to the code in the previous section allowed me to solve all Sam and Polly puzzles for all pairs of bounds `(L,H)` in the range 0 to 300 inclusive. Going forward, I’ll write such ranges in Haskell‘s syntax, e.g. `[0..300]`. Here’s what I found:

1. Almost all of the unique solutions occur when `L` is 2 or 3.
2. When `L == 2`, there is a unique solution `(4,13)` if `H` is in the range `[62..300]`, and there are no solutions for `H <= 61`. (Thus 2 could have replaced 3 in the statement of the riddle, resulting in a different unique solution. The pair `(4,13)` is also the unique solution to Freudenthal’s original riddle with `2 <= x < y` and `x+y <= 100`.)
3. When `L == 3`, there is a unique solution `(13,16)` if `H` is in the range `[94..300]`, and there are no solutions for `H <= 93`. (So there is nothing particularly special about the 97 in the original riddle: 100 or 300 or 256 would have served just as well.)
4. No solutions at all were found with `L >= 4`.
5. When `L` is either 0 or 1, there is a unique solution `(1,4)` whenever `H` is an element of `[10,11,12,13,22,23,24]`. There are no solutions for `H <= 9`. Every `H` in `[14..21]` or `[25..300]` has at least two solutions (and up to twelve), always including `(1,4)`.
6. Actually, for every `H` in `[1..300]`, the solution set for the bound pairs `(0,H)` and `(1,H)` are identical. The number of solutions here as a function of `H` is roughly increasing over the range `[1..300]`, but not monotonically:

Created in matplotlib

So it’s not generally the case that increasing the upper bound `H` will never decrease the number of solutions. However, the property of “being a solution” only involves quantification over bounded ranges, and so is “absolute” for large `H` in the sense that for fixed `L`, for every pair `(x,y)`, there is a number `M` such that if `(x,y)` is a solution for some `H >= M`, then `(x,y)` is a solution for every `H >= M`. (And this happens for some `M` if and only `(x,y)` is a solution to the riddle with lower bound `L` and no upper bound at all.)

Then points (2) and (3) above suggest the following two conjectures (the first is true, and the second is false):

• Conjecture 1 (true): When `L == 2`, the pair `(4,13)` is a solution for all `H >= 62`; when `L == 3`, the pair `(13,16)` is a solution for all `H >= 94`.
• Conjecture 2 (false): When `L == 2`, the pair `(4,13)` is the unique solution for all `H >= 62`; when `L == 3`, the pair `(13,16)` is the unique solution for all `H >= 94`. The upper bounds can be removed from the Sam and Polly puzzle.

Conjecture 2 is disproved by running a few more numbers. It turns out that:

• When `L == 2`, the solution `(4,13)` is unique for `H` in `[62..865]`. For `H` in `[866..1000]`, there is an additional solution `(4,61)`.
• When `L == 3`, the solution `(13,16)` is unique for `H` in `[94..957]`. For `H` in `[958..1000]`, there is an additional solution `(16,73)`.

On the other hand, Conjecture 1 can be verified algorithmically, thanks to our observation that all the quantification involved in determining whether a pair `(x,y)` is a solution for `(L,H)` is bounded in terms of `y` (i.e. with bound not depending on `H`). The bound is quartic in `y`.1

To verify Conjecture 1, we replace the dictionaries `SUMS` and `PRODS` with functions `sums` and `prods` which calculate all the ways a number can be written as a sum or product respectively, and these functions also keep track of the largest term or factor ever considered in the computation. This is the job of the global variable `M` in the code below.

That is, when `H` is at least (the final value of) `M`, further increasing it (or removing the upper bound entirely) does not change whether or not a pair `(x,y)` is a solution.

Here is the relevant code, with the definition of `divisors` elided (`divisors(n)` should be a list of all numbers in `[1..n]` dividing `n`).

The script is then run on the command line, e.g.

``````python test_unbdd_solt.py 3 13 16
``````

prints

``````True
M: 444
``````

This means that for `L == 3`, the pair `(13,16)` is a solution whenever `H >= 444`. The bound can of course be decreased to 94, because we’ve already checked that `(13,16)` is a solution for `H` in `[94..443]`.

Similarly with command line arguments `2 4 13` the script prints

``````True
M: 171
``````
``` ```

which tells us that for `L == 2`, `(4,13)` is a solution whenever `H >= 171` (and hence whenever `h >= 62` by previous work).

This computation proves Conjecture 1, and it also gives us a tool for strongly disproving the uniqueness in Conjecture 2: we can also show that the pairs `(4,61)` and `(16,73)` (for `L == 2` or `L == 3` respectively) are also solutions for all sufficiently large `H`.

Sufficiently large means `H >= 34980` in the first case, and `H >= 36410` in the second. This is our quartic bound: Sam and Polly end up considering fairly large numbers, even when `x` and `y` are relatively small.

It's probably true that the bounds on `H` can be brought down to 865 and 957 respectively (where the solutions `(4,61)` and `(16,73)` first appear), but the (naive?) Sam and Polly solver can't realistically check all `H` values into 5 digits.

Finally, I was able to find and check several more such solutions.

• When `L == 2`, any pair in the following list is a solution for all large `H`: `[(4,13), (4,61), (4,229), (16,73), (16,111), (32,131), (64,127)].` Checking the pair `(4,229)` requires quantification up to about 5.8 million!
• When `L == 3`, any pair in the following list is a solution for all large `H`: `[(13,16), (16,73), (16,133), (64,127)]`. Two of these pairs also belong to the other list.

I don't know whether there are infinitely many such solutions for either `L`.

Most of the above solution pairs consist of an odd prime and a power of 2 (111 and 133 are not prime); before reading the next section, it may be worthwhile to think about why this is often the case.

3. Sam and Polly Unbounded

As we saw in the previous section, the Sam and Polly riddle without an upper bound (and with lower bound 2 or 3) has multiple solutions.

The following unbounded variation on the riddle apparently does admit a unique solution:

Sam and Polly are perfectly logical mathematicians. A student walks in and says: "I'm thinking of two numbers `x` and `y`, with `2 <= x <= y`. I'll tell their sum to Sam, and their product to Polly." She does so, then leaves, and the following conversation takes place:

Sam (to Polly): You can't know what `x` and `y` are.

Polly (to Sam): That didn't help; I still don't know `x` and `y`.

Sam (to Polly): Now I know `x` and `y`.

Find `x` and `y`. You may assume the Goldbach Conjecture.

This version is attributed to Barry Wolk of the University of Manitoba, and is mentioned in passing (without solution) in Martin Gardner's 1980 Mathematical Games column in Scientific American 242 (here, but paywalled).

It is a simple manner to modify the script `test_unbdd_solt.py` to check whether a particular pair `(x,y)` solves Wolk's version of Sam and Polly: the dedicated minimalist will simply change `==` to `<` in the definition of `polly_approves`!

Searching all pairs `(x,y)` in the range `[2..100]` finds exactly one solution `(5,6)`. Then the difficult part of Wolk's puzzle is proving this solution to be unique.

I was unable to do this in a week or so of on-and-off effort. Nor could I find a proof on the internet: Gardner does not give one, and the claimed "solution" at this French website actually only checks pairs with sum up to 59 (i.e. no attempt is made to prove uniqueness).

What I was able to do was contrive a variation on Wolk's riddle admitting an easier proof of uniqueness.

The only change I made to Wolk's version was to weaken Sam's original statement to

"Unless one of the numbers is 2, you can't know what `x` and `y` are.

Another trivial modification to our script checks that `(3,6)` (sic) is the only solution in the range `[2..100]`. For uniqueness, we prove:

Claim. There is no solution to the modified Wolk riddle when the sum `s = x+y` exceeds 11.

This claim implies uniqueness, as we've already checked all `(x,y)` with sum up to 102.

Proof of Claim: Given the setup, Sam can make his (modified) first statement if and only if the sum `s` cannot be written as a sum of two odd primes. This is in turn equivalent to `s` being either odd or 4. (Proof: if `s` is not a sum of two primes, then `s` is odd by Goldbach's Conjecture. If `s` is a sum of two primes but not a sum of two odd primes, then `s` is either odd or 4. Conversely, neither 4 nor any odd number is a sum of two odd primes.)

Now given that Sam makes his first statement, Polly being unable to determine `x` and `y` implies `s` is not 4, hence is odd. This implies her product `p` is even.

Polly can make her statement iff there are at least two distinct pairs `(a,b)` with `a >= 2` even, `b >= 2` odd, and `a*b == p`. This happens iff `p` has at least two odd prime factors (counted according to multiplicity), i.e. `p` is not `q*2**k` for some odd prime `q` and some `k >= 1`.

Thus Sam can make his second statement iff there's a unique pair `(c,d)` with `c >= 2` even, `d >= 2` odd such that `c*d` is not `q*2**k` for some odd prime `q` and some `k >= 1` (note that if there were such a decomposition, we'd necessarily have `c == 2**k` and `d == q`). We claim that `s` must be `<= 11` for this to happen.

If `s >= 12`, consider the three pairs `(3,s-3)`, `(5,s-5)`, and `(7,s-7)`. Note since `s-7 > 4`, that at most one of `s-7`, `s-5`, and `s-3` can be a power of two. Thus at least two of the pairs have products not of the form `q*2**k` for odd prime `q` and `k >= 1`.

This completes the solution to the modified Wolk riddle.

In the original Wolk riddle, Sam's first statement does not admit such an easy characterization as "the sum is odd or 4"; the simplest characterization seems to be "the sum is odd and is not 2 more than a prime".

This difficulty percolates through the problem. At the end of the above proof, we concluded that large sums `s` have at least two decompositions `c+d` with `c*d` ambiguous to Polly. With the original version of Sam's first statement, "ambiguous to Polly" doesn't just mean not `q*2**k` for odd prime `q`; it means there are at least two decompositions `a*b` of `c*d` with `a+b` not merely odd but also not 2 more than a prime.

While the primes have 0 asymptotic density (so that most odd numbers are not 2 more than a prime), I don't know how to actually guarantee that at least two decompositions `a*b` of a number have `a+b` not 2 more than a prime.

Perhaps there is a counting argument that I was not clever enough to see.

At the very least, reasoning in terms of primality allowed me to write a faster solution searcher for Wolk's riddle by rewriting the naive superlinear `sam_approves` with a polylogarithmic primality test (the particular test I used is Miller-Rabin below 4 trillion). A little more efficiency is squeezed out by:

• Memoizing (pure) functions which will be called several times with the same arguments. (In idiomatic Python, this is done with function decorators.)
• Checking length conditions lazily with generators. For example, we only need to compute the first two elements of a list to determine that it has length greater than 1.
• Rewriting everything in C. Just kidding, I wasn't really interested in doing that!

I ran the following script for a few hours, verifying that there are no solutions other than `(5,6)` in the range `[2..5000]`. So at least now I'm confident that Wolk's riddle is correct, even if I can't solve it.

Again, I've elided the definitions of `isprime` and `divisors`.

Notes

1. Independent of `H`, checking `sam_approves(s)` involves quantification up to `O(s**2)`. Checking `polly_approves(p)` involves checking `sam_approves(s)` for `s` up to `O(p)`, i.e. involves quantification up to `O(p**2)`. Finally checking `sam_approves_again(s)` involves checking `polly_approves(p)` for `p` up to `O(s**2)`, hence involves quantification up to `O(s**4)`, i.e. up to `O(y**4)`.