Seriously folks, trust the equations

This is a quick one. I just saw a prominent Bayesian on Twitter make the claim, “He’s confused. Double use of data also wrong from B perspective. Update happens only once.” Rather than argue endlessly about philosophy, let’s examine the equations and see what they say.

What disaster befalls you if you’re foolish enough to double update the data?

(1)   \begin{eqnarray*} P(A | B, B) &=& \frac{P(B, B|A)}{P(B,B)}P(A) \\ &=&  \frac{P(B | B, A)P(B|A)}{P(B|B)P(B)}P(A) \\ &=& \frac{P(B|A)}{P(B)}P(A) \\ &=& P(A|B) \end{eqnarray*}

The third line follows because P(B|B,A)=P(B|B)=1. Since P(A|B,B)=P(A|B), you should be able to update twice, or 1000 times, and get the right answer. (**)
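If you'd rather check this numerically than take the algebra on faith, here is a minimal Python sketch. The joint distribution over (A, B) is made up purely for illustration, and the helper names `condition` and `prob` are my own. It conditions on the event B once, twice, and 1000 times, and compares the resulting probability of A.

```python
from fractions import Fraction

# Hypothetical joint distribution over (A, B); the numbers are illustrative only.
joint = {
    (True, True): Fraction(3, 10),
    (True, False): Fraction(1, 10),
    (False, True): Fraction(2, 10),
    (False, False): Fraction(4, 10),
}


def condition(dist, event):
    """Condition dist on an event (a predicate on outcomes) and renormalize."""
    z = sum(p for outcome, p in dist.items() if event(outcome))
    return {
        outcome: (p / z if event(outcome) else Fraction(0))
        for outcome, p in dist.items()
    }


def prob(dist, event):
    """Probability of an event under dist."""
    return sum(p for outcome, p in dist.items() if event(outcome))


def is_a(outcome):
    return outcome[0]


def is_b(outcome):
    return outcome[1]


once = condition(joint, is_b)    # P(. | B)
twice = condition(once, is_b)    # P(. | B, B)

many = joint
for _ in range(1000):            # P(. | B, B, ..., B)
    many = condition(many, is_b)

p_once = prob(once, is_a)
p_twice = prob(twice, is_a)
p_many = prob(many, is_a)
print(p_once, p_twice, p_many)
```

The second and later updates divide by P(B|B)=1, so they change nothing. Exact fractions are used so the comparison is not muddied by rounding.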

It’s not even true that you can’t use the data “to suggest a hypothesis”. Suppose you’re trying to assess some model from data, but aren’t sure which hypothesis H_k to base it on. Then a straightforward application of the equations of probability theory gives,

(2)   \begin{equation*} P(Model|Data) = \sum_kP(Model|Data, H_k)P(H_k|Data) \end{equation*}
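Equation (2) is just the law of total probability applied inside the conditional on the data, and it can be checked directly. The sketch below assumes a small discrete joint distribution over (Model, H_k, Data) with arbitrary weights that I made up. It computes P(Model|Data) once directly and once through the sum over hypotheses.

```python
from fractions import Fraction
from itertools import product

# Hypothetical joint over (model, hypothesis k, data); weights are arbitrary.
outcomes = list(product((0, 1), (0, 1, 2), (0, 1)))
weights = {outcome: i + 1 for i, outcome in enumerate(outcomes)}
total = sum(weights.values())
joint = {outcome: Fraction(w, total) for outcome, w in weights.items()}


def prob(event, given=lambda outcome: True):
    """P(event | given) computed from the joint distribution."""
    num = sum(p for o, p in joint.items() if event(o) and given(o))
    den = sum(p for o, p in joint.items() if given(o))
    return num / den


def model_true(o):
    return o[0] == 1


def data_seen(o):
    return o[2] == 1


# Left-hand side of (2): P(Model | Data).
direct = prob(model_true, given=data_seen)

# Right-hand side of (2): sum_k P(Model | Data, H_k) P(H_k | Data).
via_hypotheses = sum(
    prob(model_true, given=lambda o, k=k: data_seen(o) and o[1] == k)
    * prob(lambda o, k=k: o[1] == k, given=data_seen)
    for k in (0, 1, 2)
)

print(direct, via_hypotheses)
```

The weights P(H_k|Data) in the sum are computed from the same data that the model is then assessed on, and the two sides still agree exactly.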

The term P(H_k|Data) looks a hell of a lot like using the data to “suggest the hypothesis”.

So the equations say you can do these things. If you do them and consistently get wrong answers, then you’re screwing something else up. And that was precisely my point with regard to preregistration: it’s not the double use of data that’s causing the problem, it’s something else. The whole preregistration movement wants to eliminate a non-problem with gimmicks, while ignoring that “something else”.

This is not necessarily a disaster since it may not do real harm. But it’s a step sideways, not forward.

If you’re interested in finding that “something else”, you might consider that before the rise of Frequentist Statistics (~1920-1960), none of the great scientists of old felt the need for preregistration gimmicks.

(**) Incidentally, this is a wide-ranging information processing property in Bayesian Statistics. Maximizing the Entropy subject to the same constraint twice similarly has no effect on the resulting Maxent distribution.
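The Maxent remark can be illustrated with a rough sketch. Some assumptions of mine: the example is a six-sided die with a mean constraint of 4.5, the "update" is done by minimizing relative entropy to a prior (which reduces to ordinary Maxent when the prior is uniform), and the helper `tilt` is a hypothetical name. The first call imposes the constraint on a uniform prior. The second imposes the same constraint on the result.

```python
import math


def tilt(prior, values, target_mean):
    """Return (p, lam) with p proportional to prior * exp(lam * value),
    where lam is found by bisection so that p has the requested mean."""

    def weights(lam):
        w = [q * math.exp(lam * v) for q, v in zip(prior, values)]
        z = sum(w)
        return [wi / z for wi in w]

    def mean_at(lam):
        return sum(p * v for p, v in zip(weights(lam), values))

    lo, hi = -20.0, 20.0          # the mean is increasing in lam
    for _ in range(200):
        mid = (lo + hi) / 2
        if mean_at(mid) < target_mean:
            lo = mid
        else:
            hi = mid
    lam = (lo + hi) / 2
    return weights(lam), lam


faces = [1, 2, 3, 4, 5, 6]
uniform = [1 / 6] * 6

# First pass: impose the mean constraint on a uniform prior.
p_first, lam_first = tilt(uniform, faces, 4.5)

# Second pass: impose the same constraint again, starting from the result.
p_second, lam_second = tilt(p_first, faces, 4.5)

max_change = max(abs(a - b) for a, b in zip(p_first, p_second))
print(lam_first, lam_second, max_change)
```

On the second pass the constraint is already satisfied, so the multiplier comes out at essentially zero and the distribution does not move.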