Maarten Maartensz:    Philosophical Dictionary | Filosofisch Woordenboek                      

 B - Bayes' Theorem


Bayes' Theorem: This has many formulations, but essentially comes down to the observation that one can learn from experience using probability theory while avoiding the fallacy of affirming the consequent. In the end this all rests on the following theorem of elementary probability theory: p(T|P)=p(P|T)*p(T):p(P), where ":" denotes division.

In this equation, the factors have standard names:

p(T|P) is the posterior probability (of the theory T given the data P)
p(P|T) is the likelihood (of the data P given the theory T)
p(T) is the prior probability (of the theory T)
p(P) is the probability of the data
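
The theorem follows in one step from the definition of conditional probability. The derivation below is a sketch that assumes p(P) > 0, and it writes a fraction bar where this entry writes ":" for division:

```latex
\begin{align*}
p(T \mid P)\, p(P) &= p(T \wedge P) = p(P \mid T)\, p(T) \\
p(T \mid P) &= \frac{p(P \mid T)\, p(T)}{p(P)}
\end{align*}
```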

1. Usefulness of Bayes' Theorem: The strength, interest and usefulness of Bayes' Theorem can be explained as follows. If T is a theory and P a prediction of the theory, one can recalculate the probability of T given that P is true (or false) from three quantities: the probability of P given T (which one derives from the theory T itself), the probability of one's theory T, and the probability of one's prediction P.

A classical example involves finding the orbit of a comet from a few observations.

Suppose one has a theory T about that orbit that may have a low initial probability p(T) (since there are many possible orbits), and a prediction P that the comet will be at a certain place at a certain time, of which the probability p(P) also will be low apart from T (since the comet may in fact be at many possible places), although p(P|T), i.e. the probability that the comet will be at a certain place at a certain time if the theory is true, as a rule will be high (since else one wouldn't propose the theory).

Now it follows by Bayes' Theorem, i.e. the above elementary formula of probability theory, that p(T|P), i.e. the probability of the theory T if the prediction is true, will be much higher than p(T) was before the prediction was verified, and indeed in the ratio p(P|T):p(P). Thus, if p(P|T) was 90/100 and p(P) was 1/100, then p(T|P)=90*p(T), which may make the new probability p(T|P) appreciable even if p(T) may have been quite low to start with (say also 1/100, e.g. because the probability of its prediction P is that low). Thus, the new p(T|P)=90/100, whereas the old p(T), before finding that P is true, was 1/100.
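
The arithmetic of the comet example can be checked with a few lines of Python. This is only a minimal sketch: the function name `posterior` and the variable names are illustrative choices, and exact fractions are used so that no rounding enters the result.

```python
from fractions import Fraction


def posterior(likelihood, prior, evidence):
    """Bayes' Theorem: p(T|P) = p(P|T) * p(T) / p(P)."""
    if evidence <= 0:
        raise ValueError("p(P) must be positive")
    result = likelihood * prior / evidence
    if result > 1:
        # Coherence requires p(P) >= p(P|T) * p(T).
        raise ValueError("inconsistent inputs: p(P) is smaller than p(P|T) * p(T)")
    return result


# The numbers used in the comet example above.
p_P_given_T = Fraction(90, 100)  # likelihood of the prediction given the theory
p_P = Fraction(1, 100)           # probability of the prediction
p_T = Fraction(1, 100)           # prior probability of the theory

p_T_given_P = posterior(p_P_given_T, p_T, p_P)
print(p_T_given_P)        # 9/10
print(p_T_given_P / p_T)  # 90
```

The second printed value is the ratio p(P|T):p(P) by which the verified prediction multiplies the prior.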

2. Problem of Bayes' Theorem: The main problem involved in Bayes' Theorem is that it often is not clear how one can establish the three probabilities one requires to use it, namely p(P|T), p(P) and p(T).

This is especially so with p(T), in that one often can make plausible cases for p(P|T) (it must be high if the explanation is to be useful) and p(P) (there often can be given evidence that if T is not true, then P is not probable at all), but since theories cannot be counted like blueberries or particular instances of kinds of fact, there often seems to be no plausible way to fix the probability of a theory.

There are several ways to circumvent the problem, but the usual ones (such as so-called likelihood ratios, p(P|T):p(P|~T), or prior odds, p(T):p(~T)) all seem to involve a considerable element of subjectivity: In the end, it all comes down to one's subjective degree of belief in T.

For those who believe that probability is subjective, this is no objection, and indeed believers in subjective probability feel quite free in using Bayes' Theorem, while also some have converted to a subjective interpretation of probability theory precisely because it permits one to apply Bayes' Theorem.

The problem with this, apart from other objections to subjective interpretations of probability theory, is that in practice it won't help much, for example with fanatics.

Take Darwin's theory of evolution. This accounts quite well for many otherwise problematic facts, and has quite a few successful predictions to its credit that do not follow from other theories, such as divine providence. Thus it can be seen as being quite well confirmed by the evidence and by Bayesian reasoning - unless one is both a believer in the subjective interpretation of probability and in divine providence, and therefore fixes the probability of the Darwinian theory as so small (say, in the order of 10^-1000) that no practical amount of evidence can much change this (except with verified predictions of the same order of improbability).
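
To give a rough sense of scale, the sketch below counts how many verified predictions would be needed to lift a theory to even odds. It rests on assumptions that are not in the entry: it uses the odds form of Bayes' Theorem (posterior odds = prior odds times the likelihood ratio p(P|T):p(P|~T)), it treats the predictions as independent, and it gives each the same illustrative likelihood ratio of 90. The function name `confirmations_needed` is hypothetical. Logarithms are used because a number like 10^-1000 cannot be stored as an ordinary floating-point value.

```python
import math


def confirmations_needed(log10_prior_odds, likelihood_ratio, log10_target_odds=0.0):
    """Count the independent confirmations needed to raise the odds to the target.

    Each confirmation multiplies the odds by the likelihood ratio, i.e. adds
    log10(likelihood_ratio) to the log10 odds. The default target is even odds.
    """
    if likelihood_ratio <= 1:
        raise ValueError("the evidence must favour T (likelihood ratio > 1)")
    gap = log10_target_odds - log10_prior_odds
    if gap <= 0:
        return 0
    return math.ceil(gap / math.log10(likelihood_ratio))


# Assumed, illustrative likelihood ratio for every verified prediction.
RATIO = 90

modest = confirmations_needed(-2, RATIO)      # prior odds of about 1 to 100
fanatic = confirmations_needed(-1000, RATIO)  # prior odds of about 10^-1000
print(modest, fanatic)
```

Under these assumptions a modest prior is overcome by a couple of verified predictions, whereas the fanatic's prior requires several hundred of them, each as striking as the comet prediction.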

3. A possible solution: One possible solution is to make a special assumption about the probability of a theory. This follows, after a definition of a term that occurs in the assumption to be made: The proper consequences of a theory T are those statements that follow from T but do not follow from ~T.

Now one may assume the following Theoretical Probability Postulate or TPP:

  • TPP: The probability x of a theory T at any time t is the probability of the least probable proper consequence that is known to follow from T at time t.

The justification for this assumption is that we certainly know that p(T) cannot be higher than the probability of its least probable proper consequence, for that follows from probability theory, whereas the stated conventional assumption answers the problem of how to attribute a probability to a theory, and indeed does so uniquely and with empirical justification, namely the probability of the least probable proper consequence of the theory.

Thus our assumption for the probability of a theory T at time t is that it is the maximum of what it may be at t, given the probabilities of the known proper consequences of T. This is an assumption; it is consistent with probability theory; it is based on the known facts about what T entails; and it is a convention.
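
As a computation, the TPP is simply a minimum taken over the proper consequences known at a given time. The sketch below is illustrative only: the function name `tpp_probability`, the statements and their probabilities are all hypothetical, and it is assumed that each known proper consequence already has a probability attached to it.

```python
def tpp_probability(proper_consequences):
    """TPP: p(T) at time t is the probability of the least probable
    proper consequence known to follow from T at time t."""
    if not proper_consequences:
        raise ValueError("the TPP needs at least one known proper consequence")
    for statement, p in proper_consequences.items():
        if not 0 <= p <= 1:
            raise ValueError(f"not a probability for {statement!r}: {p}")
    return min(proper_consequences.values())


# Hypothetical proper consequences of a comet theory known at time t.
known_at_t = {
    "comet is at place A at time 1": 0.30,
    "comet is at place B at time 2": 0.05,
}
p_T_at_t = tpp_probability(known_at_t)
print(p_T_at_t)  # 0.05

# Deriving a new, less probable proper consequence later lowers p(T).
known_later = dict(known_at_t, **{"comet is at place C at time 3": 0.01})
p_T_later = tpp_probability(known_later)
print(p_T_later)  # 0.01
```

Because the value depends on what is known to follow from T at time t, it can only stay the same or fall as further proper consequences are derived; it rises again only when Bayes' Theorem is applied to verified predictions.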


See also: Bayesian Conditionalization, Defeasible reasoning, Probability Theory, Personal Probability, Problem of induction, Rules of Probabilistic Reasoning

Literature: Ayer, Hume, Goodman, Howson & Urbach, Maartensz, Rescher, Russell, Skyrms, Stegmüller.

 Original: Mar 25, 2005                                                Last edited: 24 September 2014.