Tuesday, May 20, 2008

When Anton Chigurh comes to town you'd best have your lucky quarter. There could be a lot riding on it when he asks you to call it, and you don't want it to come up tails, friend-o.

You check your pocket. Yep, it's still in there. But wait - there are two quarters in your pocket! You look at them. They are indistinguishable, but you know that one is your lucky quarter that always comes up heads. What do you do?

Obviously, you toss them both. If one comes up tails, the other must be the lucky one. If both come up heads, try again. (If both come up tails, you'd better get out of town, fast!) How many tosses are you likely to have to make?

There are two outcomes: both heads and one heads, one tails. If the second coin is a `fair' coin, then each outcome is equally likely. So with one toss, you ought to have a 50-50 chance of finding out which quarter is `normal'. If you don't, then you ought to have a 50-50 chance of finding out on the *next* flip. Each flip cuts the chances of remaining ignorant by 1/2.

So how many tosses are you likely to have to make? Intuitively, it seems that you shouldn't need more than a few tosses to determine which coin is the lucky one, but it also seems that you might need at least a couple. We can rephrase the question: Do you think that you'll need a million tosses? That's absurd. How about a thousand? Still absurd. How about twenty? Hmmm... 2^20 ~= 1,000,000. One chance in a million? Won't happen. Ten? 2^10 ~= 1,000. Maybe...

We can maps tosses to log-odds (if we use 10log10, we'll get the log-odds in decibels!)

N-tosses     Odds of            
            Getting tails    Log Odds in dB
            at least once 
    0            1/1       0
    1            2/1       3.010299956639812
    2            4/1       6.020599913279623
    3            8/1       9.030899869919434
    4            16/1      12.04119982655925
    5            32/1      15.05149978319906
    6            64/1      18.06179973983887
    7            128/1     21.07209969647868
    8            256/1     24.08239965311849
    9            512/1     27.09269960975831
    10           1024/1    30.10299956639812
    11           2048/1    33.11329952303793
    12           ...       36.12359947967774
    13           ...       39.13389943631755
    14                     42.14419939295736
    15                     45.15449934959717
    16                     48.16479930623699
    17                     51.1750992628768 
    18                     54.18539921951661
    19                     57.19569917615642
    20                     60.20599913279624
.
Why are we tossing both coins? We already know that one coin is going to come up heads every time we toss it, so we don't get any new information. We can just select a coin and start tossing. If it ever comes up tails, the other coin is the one we want, but if it continues to come up heads, it must be the two headed coin. So let's just toss one coin. This changes things slightly. After our first toss, we have three possible outcomes. If it comes up tails, then we know we chose the fair coin. If it comes up heads, we could have chosen the fair coin and gotten lucky, or we could have chosen the unfair coin. The probabilities are this:

        p(Unfair) = 1/2
        p(Fair & Heads) = 1/4
        p(Fair & Tails) = 1/4
.
If it comes up tails, we know we have the wrong coin, but if it comes up heads, what is the probability that we have the right coin? I.e. what is
          p(Unfair|heads)
.
This can be computed with Bayes rule:

                                        p(heads|Unfair) 
        p(Unfair|heads) =  p(Unfair) * ------------------ 
                                           p(heads)      

        p(Unfair) = 1/2
        p(heads|Unfair) = 1
        p(heads) = 1/2 + 1/4
                                      1
        so p(Unfair|heads) = 1/2 * -------- = 1/2 * 4/3 = 2/3
                                     3/4
.
And if we toss it again, the probabilities are these:

        p(Unfair) = 1/2                (unfair is always heads)
        p(Fair & Heads & Heads) = 1/8
        p(Fair & Heads & Tails) = 1/8
        p(Fair & Tails) = 1/4          (we stop flipping!)

.
and we want to compute p(Unfair | heads & heads)


                                                  p(heads & heads|Unfair)
        p(Unfair|heads & heads) =  p(Unfair) * --------------------------- 
                                                   p(heads & heads)      

        p(Unfair) = 1/2
        p(heads & heads | Unfair) = 1
        p(heads & heads) = 1/2 + 1/8 = 5/8


                                      1
        so p(Unfair|heads) = 1/2 * -------- = 1/2 * 8/5 = 4/5
                                     5/8
.
We can continue in this vein as long as we want, but the calculations become *much* easier if we work in log-odds:

  N-tosses    p(Unfair| n-heads)   O(Unfair|n-heads)  log(O(Unfair|n-heads))
     0              1/2                 1/1               0
     1              2/3                 2/1           3.010299956639812
     2              4/5                 4/1           6.020599913279623
     3              8/9                 8/1           9.030899869919434
     4             16/17               16/1          12.04119982655925
     5             32/33               32/1          15.05149978319906
     6             64/65               64/1          18.06179973983887
     7            128/129             128/1          21.07209969647868
     8            256/257             256/1          24.08239965311849
     9            512/513             512/1          27.09269960975831
    10           1024/1025           1024/1          30.10299956639812
.
Each time we toss the coin and it comes up heads, we become more sure that the coin is the unfair one. We can quantify the improvement in the odds:

                       p(heads|Unfair)
         O(Unfair) * ------------------ = O(Unfair | heads)
                       p(heads|Fair)      
.
The odds of it being unfair, given that we just tossed `heads' is the right hand side, the left hand side is the product of the odds of it being unfair *before* we tossed it and the ratio of probabilities of heads depending on the selection of the coin. Rewriting this:


         O(Unfair|heads)      p(heads|Unfair)     2
        ----------------  = ------------------ = ---
          O(Unfair)           p(heads|Fair)       1
.
So every time we flip the coin and get heads, we improve the odds of having the unfair coin by 2:1. In log-odds space, multiplication becomes addition, so flipping the coin and getting heads adds 3.01 to our log odds estimate of the unfairness.

The cool thing about formulating the problem in this way is that you have separated the probability estimation into a number of independent steps. No matter how you arrive at an estimate of the odds, you can flip the coin and add 3.01 to your estimate. Here is an example: suppose you dropped your lucky quarter in a jar full of regular quarters. To find your lucky quarter again, you flip *all* the quarters and set aside the ones that come up tails. If the jar held 128 quarters, about 64 would come up tails and you could filter them out. Another flip discounts 32 more, then 16, etc. It would take you 7 flips just to get back to the point where you had two coins and weren't sure which one was the lucky one. Now the log of 1/128 is -21.07 Each flip adds 3.01 to your log odds estimate, so 7 flips adds 21.07, leaving you at 0 -- 50/50 odds.