Bayes' Theorem Part 1: Belief in ESP

Bayes' Theorem can be used to determine how strongly evidence should affect a given belief. This methodology is explained and defended in the book The Logic of Science by E.T. Jaynes (see here). Jaynes argues that Bayes' Theorem can be derived from desirable postulates of plausible reasoning. That is, Bayes' Theorem in probability is simply a special case of a more general form of inductive logic.

In this post, I intend to introduce Bayes' Theorem by showing its application to assessing evidence in an unusual area. This post relates to the debate between William Lane Craig and Bart Ehrman found here. I think that Dr. Craig was technically correct in his critique of "Ehrman's Egregious error", but this fact does not help Dr. Craig's case as much as he would hope. That discussion will have to wait for another post. My main focus here will be assessing the question "Are ESP skeptics guilty of an anti-supernatural bias?"

This post will utilize some math. The ideas expressed here could be given without the mathematics. However, I find that putting the ideas in a mathematical framework forces me to be more disciplined in my thinking. It can make the discussion more concise, and it makes my thinking more scrutable. It gives critics a very clear idea of what evidence is moving my beliefs and can sharpen criticism of my position.

Much of this post follows Chapter 5 in Jaynes' book. There is a published report of apparent telepathic events (Soal, S. G., and Bateman, F. Modern Experiments in Telepathy. New Haven: Yale University Press, 1954.) One card guessing experiment was designed to produce a probability of a correct selection of 20%. Mrs. Stewart reportedly predicted the correct card 9410 times out of 37100 trials (she was correct 25.36% of the time). This is over 25 standard deviations away from the expected value. I will try to examine how one could interpret this data in light of Bayes' Theorem.
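
The "over 25 standard deviations" figure can be checked in a few lines. This is a minimal sketch that assumes each guess is an independent trial with a 20% chance of success under the chance hypothesis; the variable names are my own and are not taken from Jaynes or from Soal and Bateman.

```python
import math

n = 37100   # number of trials reported
r = 9410    # number of correct guesses reported
p = 0.2     # probability of a correct guess by pure chance

mean = n * p                       # expected number of hits under chance
sd = math.sqrt(n * p * (1 - p))    # binomial standard deviation
z = (r - mean) / sd                # distance from the expected value, in standard deviations

print(f"observed hit rate = {r / n:.4f}")
print(f"expected hits = {mean:.0f}, sd = {sd:.2f}, z = {z:.1f}")
```

The normal approximation behind a "number of standard deviations" is only a rough guide this far into the tail, which is one reason the rest of the post works with the binomial probability directly.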

First, some background on Bayes' Theorem. (This description follows the Wikipedia article here.) Bayes' Theorem is often written as:
$$P(H \mid E) = \frac{P(E \mid H)\,P(H)}{P(E)}$$
  • P(H | E) is the probability/plausibility of the hypothesis H in light of evidence E.
  • P(E | H) is the probability of observing Evidence E given that hypothesis H is true.
  • P(H) is the probability that a Hypothesis is true based upon background assumptions
  • P(E) is the probability of observing the evidence regardless of the truth of the hypothesis

The denominator is often rewritten using an identity in probability,
$$P(E) = P(E \mid H_1)P(H_1) + P(E \mid H_2)P(H_2) + \dots + P(E \mid H_n)P(H_n),$$
where H1 through Hn are mutually exclusive hypotheses that together cover all the possibilities. In the case of two hypotheses, ~H is considered the negation of the primary hypothesis, P(~H) = 1 - P(H), and Bayes' Theorem is written as:
$$P(H \mid E) = \frac{P(E \mid H)\,P(H)}{P(E \mid H)\,P(H) + P(E \mid {\sim}H)\,(1 - P(H))}$$

How does Bayes' Theorem indicate that we should assess the likelihood that Mrs. Stewart has exhibited ESP, assuming that ESP and random chance are the only explanations? The probability of selecting r correct cards out of n trials is given by the binomial distribution. (The formulas I used to compute the specific values $P(\text{Data} \mid H_{P=p})$ are given at the end of the post.)

| Hypothesis | Description | $P(\text{Data} \mid \text{Hypothesis})$ |
|---|---|---|
| $H_{P=0.2}$ | Mrs. Stewart's results are a result of pure chance. | $2.003 \times 10^{-139}$ |
| $H_{P=0.2536}$ | Mrs. Stewart is able to predict cards at a rate of 0.2536, which is indicative of ESP. | $4.76 \times 10^{-3}$ |
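
As a cross-check on the two likelihoods, here is a minimal sketch that evaluates the binomial probability in log space. It uses Python's `math.lgamma` for the factorials, which is a different route from the approximation I describe at the end of the post, so the results should land close to the tabulated values rather than match every digit. The function name `binomial_likelihood` is my own.

```python
import math

def binomial_likelihood(r, n, p):
    """Probability of exactly r successes in n independent trials with success rate p."""
    # lgamma(k + 1) equals ln(k!), which avoids overflowing on huge factorials.
    log_coeff = math.lgamma(n + 1) - math.lgamma(r + 1) - math.lgamma(n - r + 1)
    log_prob = log_coeff + r * math.log(p) + (n - r) * math.log(1 - p)
    return math.exp(log_prob)

n, r = 37100, 9410
p_chance = binomial_likelihood(r, n, 0.2)     # pure chance hypothesis
p_esp = binomial_likelihood(r, n, 0.2536)     # "ESP" hypothesis

print(f"P(Data | P=0.2)    = {p_chance:.3e}")
print(f"P(Data | P=0.2536) = {p_esp:.3e}")
```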

To find how much beliefs move as a result of evidence, it is necessary to know where we start prior to the evidence. Suppose I am initially very skeptical of any claim to telepathic power. My a priori estimate that ESP has been genuinely demonstrated is about one in a billion, with all of the remaining probability assigned to random chance. In light of these prior beliefs, I would assign $P(H_{P=0.2536}) = 10^{-9}$ and $P(H_{P=0.2}) = 1 - 10^{-9}$. Plugging these values into the formula for Bayes' Theorem gives:
$$P(H_{P=0.2536} \mid E) = \frac{(4.76 \times 10^{-3})(10^{-9})}{(4.76 \times 10^{-3})(10^{-9}) + (2.003 \times 10^{-139})(1 - 10^{-9})} \approx 1.0$$
(to the limits of floating point precision)
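
The two-hypothesis calculation can be sketched in Python as follows. The likelihoods and priors are the ones given in this post; the variable names are mine. Because the posterior for ESP rounds to exactly 1.0 in double precision, the sketch also computes the posterior for chance directly, which shows how small the leftover probability is.

```python
like_esp = 4.76e-3         # P(E | ESP), from the table
like_chance = 2.003e-139   # P(E | chance), from the table
prior_esp = 1e-9           # my skeptical prior for ESP
prior_chance = 1 - prior_esp

# The denominator of Bayes' Theorem: P(E) over the two hypotheses.
evidence = like_esp * prior_esp + like_chance * prior_chance

posterior_esp = like_esp * prior_esp / evidence
# Computed directly, since 1 - posterior_esp would round to 0.0.
posterior_chance = like_chance * prior_chance / evidence

print("P(ESP | E)    =", posterior_esp)
print("P(chance | E) =", posterior_chance)
```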

Thus, with the assumption of two hypotheses, a telepathic event is nearly a certainty. (At least we know that this event is not well explained by chance.) Despite these calculations, I personally do not believe that Mrs. Stewart exhibited telepathic powers. If I believed that the two offered hypotheses were the only possibilities, perhaps I should be persuaded. However, this use of Bayes' Theorem ignores several possibilities that come to most people's minds. There is always the possibility that there was deception somewhere in the reporting chain. The researchers could have failed to report data on days when Mrs. Stewart did not do so well. Mrs. Stewart could also have noticed a reflection that the researchers missed, etc.

Let's offer a third hypothesis that includes the possibility of deception by somebody in the reporting chain. I think reasonable priors are 1 in a billion for ESP and 1 in a thousand for deception somewhere, with chance still the most likely explanation a priori. I will also assume that deception would explain the data as well as the ESP hypothesis does.

| Hypothesis | Description | $P(\text{Data} \mid \text{Hypothesis})$ |
|---|---|---|
| $H_{\text{Deception}}$ | Deception occurred somewhere | As explanatory as ESP: about $4.76 \times 10^{-3}$ |

My prior probability for ESP is still 1 in a billion. My prior probability that someone in a scholarly article would be the victim of (or purposely perpetrate) a fraud is about 1 in a thousand. My prior probability that the event can be explained by chance, without deception, makes up the remainder ($1 - 10^{-9} - 10^{-3}$). Utilizing three hypotheses changes the denominator to $P(E \mid H_{P=0.2536})\,P(H_{P=0.2536}) + P(E \mid H_{\text{Deception}})\,P(H_{\text{Deception}}) + P(E \mid H_{P=0.2})\,P(H_{P=0.2})$. Again, the last term is so small that it is lost to floating point rounding, so I will ignore it in the following calculation. With the new hypothesis considered, the probability of ESP is now:
$$P(H_{P=0.2536} \mid E) = \frac{(4.76 \times 10^{-3})(10^{-9})}{(4.76 \times 10^{-3})(10^{-9}) + (10^{-3})(4.76 \times 10^{-3})} \approx 9.99999 \times 10^{-7}$$

The probability of ESP is now about 1 in a million. Utilizing Bayes' Theorem with these values indicates that the probability of deception is very large (about 0.999999). When dealing with extremely unlikely events, seemingly unlikely alternatives can have a dramatic effect on the plausibility assessment.
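
The three-hypothesis update can be sketched with a small helper that applies Bayes' Theorem to any set of mutually exclusive hypotheses. The function name `posteriors` and the dictionary keys are my own labels, and the inputs are the priors and likelihoods assumed in this post.

```python
def posteriors(priors, likelihoods):
    """Bayes' Theorem over mutually exclusive hypotheses that cover all the cases."""
    joint = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(joint.values())    # this sum is P(E)
    return {h: joint[h] / total for h in joint}

priors = {"esp": 1e-9, "deception": 1e-3, "chance": 1 - 1e-9 - 1e-3}
likelihoods = {"esp": 4.76e-3, "deception": 4.76e-3, "chance": 2.003e-139}

result = posteriors(priors, likelihoods)
for name, prob in result.items():
    print(f"{name:10s} {prob:.6e}")
```

Changing the deception prior in `priors` is a quick way to see how sensitive the ESP posterior is to that one assumption.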

What I have attempted to do here is essentially put the parable of "The Boy who Cried Wolf" into a Bayesian framework. It is possible for someone to tell the truth and not be believed, even though the people doubting the claim are entirely rational. This gives insight into why it is so hard to establish unusual claims on the basis of testimony. That is not to say it is impossible. Note that in my example, the plausibility of ESP moved by a factor of a thousand. However, the various possibilities of deception anywhere along the reporting chain make it very difficult to convince someone of a highly unusual claim. It raises the question, "Can ESP be established on the basis of testimony?" I think it can, but the controls would need to be extensive.

This form of reasoning does capture the essence of why I do not believe most ESP claims. It is very reasonable to believe that magicians are not using supernatural powers even when you don't know how they are performing their trick. Is my rejection of Mrs. Stewart's ESP claim justified here? Am I displaying an unjustified anti-supernatural bias?

To compute my results, I used the following formulas. (In other words, only read this if you want to examine the math.) The probability of selecting exactly r of n, on the hypothesis that the selection probability is p, is given by the equation:
$$P(D \mid H_{P=p}) = \frac{n!}{r!\,(n-r)!}\; p^{r} (1-p)^{n-r}$$

The factorials are too large to be used directly with standard floating point processing, so using the Stirling approximation for large factorials:
$$n! \approx n^{n} e^{-n} \sqrt{2 \pi n}$$

And the identity
$$x = \exp(\ln x), \quad x > 0$$

after some algebra yields:
$$P(D \mid H_{P=p}) \approx \exp\Big\{ -0.5 \ln(2\pi) + (0.5 + n)\ln(n) - (0.5 + r)\ln(r) + (r - n - 0.5)\ln(n - r) + r \ln(p) + (n - r)\ln(1 - p) \Big\}$$

This expression is amenable to floating point computations. I utilized a different approximation than Jaynes did. He utilized an "entropy" approximation that gives the same answers as my formulas do. However, the number he used in his equation 5-7 appears to be wrong (this difference doesn't affect our conclusions).

Edited Aug 1 to fix various typos