Twin control studies really are evidence of causation: reply to JayMan

I found some genetically informative studies on the benefits of marriage:

These studies control for genetic and shared environmental confounding in various ways and generally find some benefits of marriage on crime reduction and mental health/well-being. The benefits reported by non-controlled studies turn out to be exaggerated, of course, because those studies don’t control for the omnipresent genetic confounding. I posted one study on Twitter:

JayMan (J) doesn’t agree. In private, he argued that twin control studies don’t rule out all confounding. This is true: they fail to rule out a very tiny amount of genetic confounding, since MZ twins are not exactly genetically identical, though they are very close. Furthermore, there is the possibility of non-shared non-genetic confounding.

I wrote:

Given that non-shared non-genetic variance is noise, one can indeed infer causation. Within MZ associations are strong evidence of causation.

(Infer here was perhaps too strong a phrasing. I meant it probabilistically.)

In J’s words:

No, because it’s not *all* noise (obviously so in the case of sexual orientation in discordant twins). Some of it is developmental variation. Some of it is the result of pathogens/other environmental insults.

However, he is incorrect:

Evidence means that the posterior probability is larger than the prior. In this case, a within MZ association rules out confounding due to A and C pathways. This is important because A confounding is probably the largest source of confounding, hence ruling it out increases the probability of all remaining options including causal connection. This being a longitudinal study (with a control too) also rules out reverse causation, further increasing the probability of forward causation.

Your argument is ignoring the probability change. To generalize and illustrate: one cannot declare something not evidence just because it does not rule out all possible alternative interpretations. Suppose we know that x must be one of 1, 2, …, 10. Ruling out 1-8 is strong evidence that x is 9, even though it is still possible that it is 10. Assuming equiprobable options, the probability increases by a factor of 5 (from 10% to 50%).

To give a concrete but simplified example: suppose that whenever we find an association between two human variables like these, 60% of the time it is due to genetic confounding, 20% of the time it is due to shared environmental confounding, 10% of the time it is due to non-shared non-genetic confounding (this includes the developmental variation that J mentions) and 10% of the time it is causal. So, the prior probability of true causality is only 10%. However, if we then find that this relationship holds when we control for genetic and shared environmental confounding, the posterior probability of causality is now 50%. This is because only non-shared non-genetic confounding and true causality remain as possible options, both with 10% prior probability, and thus each with 50% of the posterior. Thus, this represents a 5x increase in the probability. By one common Bayesian standard, this represents strong evidence.
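For readers who want to check the arithmetic, here is a minimal Python sketch of the update described above. The function name `posterior_after_ruling_out` and the dictionary labels are made up for illustration. It also rests on a simplifying assumption: the within-MZ finding is treated as impossible under the ruled-out explanations and equally likely under the surviving ones, so updating amounts to renormalizing the remaining priors.

```python
from fractions import Fraction


def posterior_after_ruling_out(priors, ruled_out):
    """Drop the ruled-out hypotheses and renormalize the rest.

    Assumption (for illustration only): the evidence has probability 0
    under every ruled-out hypothesis and the same probability under
    every surviving one.
    """
    remaining = {h: p for h, p in priors.items() if h not in ruled_out}
    total = sum(remaining.values())
    return {h: p / total for h, p in remaining.items()}


# Priors from the simplified example in the text.
priors = {
    "genetic confounding": Fraction(60, 100),
    "shared environmental confounding": Fraction(20, 100),
    "non-shared non-genetic confounding": Fraction(10, 100),
    "causal": Fraction(10, 100),
}

# A within-MZ association rules out the A and C pathways.
post = posterior_after_ruling_out(
    priors,
    {"genetic confounding", "shared environmental confounding"},
)
increase = post["causal"] / priors["causal"]

print(post["causal"])  # 1/2
print(increase)        # 5

# The equiprobable case: x is one of 1..10, and 1-8 are ruled out.
uniform = {x: Fraction(1, 10) for x in range(1, 11)}
post_uniform = posterior_after_ruling_out(uniform, set(range(1, 9)))
print(post_uniform[9])  # 1/2
```

Using `Fraction` keeps the numbers exact, so the 10% to 50% jump comes out as exactly a factor of 5 rather than a floating-point approximation.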

Back in reality, a given link between two variables will be some mix of variance pathways (e.g. 50% genetic confounding, 30% shared environmental confounding, 20% causal), not just a single one. This does not change anything substantial about the results; it only makes the calculation more complicated. (Proof of this is left to the reader!)
