You are currently viewing p > .05 thus no effect thus familial confounding fallacies

p > .05 thus no effect thus familial confounding fallacies

Reading over some sibling control studies, BIG TOUGH GUY mentioned this quite wrong study:

The topic of the study need not interest us, the error is statistical. Results table:

So we see:

  • Hazard ratio of 1.33 among unrelated people with a confidence interval of 1.23 to 1.45.
  • Sibling control study finds 1.16 ratio with confidence interval of 0.64 to 2.09.

Authors:

The Danish results strongly suggest that unmeasured genetic factors or shared familial environment are likely to account for the increased risk of schizophrenia among offspring exposed to maternal smoking. Although smoking during pregnancy is known to be harmful in many ways (e.g., low birth weight and infant mortality), and pregnant women should still be encouraged to stop smoking, maternal smoking during pregnancy seems not to be an independent risk factor for schizophrenia.

The conclusion is correct, but the inference is incorrect. These results for siblings don’t support anything of the sort. The confidence interval is huge. The sibling results are consistent with a causal effect being 3 times larger than that seen between unrelated persons. This is of course very implausible. The error is that the errors conclude from the lack of p < .05 among siblings to attenuation and thus confounding. The key metric to actually calculate here is the ratio of effect sizes, i.e., 0.33 / 0.16 (about 50%). If the authors had done this, they would have found that this ratio has a huge confidence interval that overlaps 100% (no difference in effect sizes), and thus there is no evidence of attenuation here. Their analysis is woefully underpowered for this analysis, and basically uninformative.

Second example can be found in this study:

Methods
We linked national registers and conducted a prospective cohort study of singleton Swedish-born men to explore associations between maternal pregnancy diabetes and educational achievement at age 16 years, the age of completing compulsory education in Sweden (n = 391,545 men from 337,174 families, graduating in 1988–1997 and n = 326,033 men from 282,079 families, graduating in 1998–2009), and intelligence quotient (IQ) at the mandatory conscription examination at 18 years of age (n = 664,871 from 543,203 families).
Results
Among non-siblings, maternal diabetes in pregnancy was associated with lower offspring cognitive ability even after adjustment for maternal age at birth, parity, education, early-pregnancy BMI, offspring birth year, gestational age and birthweight. For example, in non-siblings, the IQ of men whose mothers had diabetes in their pregnancy was on average 1.36 points lower (95% CI −2.12, −0.60) than men whose mothers did not have diabetes. In comparison, we found no such association within sibships (mean difference 1.70; 95% CI −1.80, 5.21).

Conclusions/interpretation
The association between maternal diabetes in pregnancy and offspring cognitive outcomes is likely explained by shared familial characteristics and not by an intrauterine mechanism.

Look at the confidence intervals. Their huge sample size estimates 1.36 points lower (95% CI −2.12, −0.60) among unrelateds, which is already close to 0. Recall IQ is SD=15, so their estimate is 0.09 d. Their sibling analysis finds 1.70; 95% CI −1.80, 5.21). This has p > .05 as it overlaps with 0, but the authors then interpret this as being evidence of attenuation. Again, this is not true. Their results don’t suggest anything because the confidence interval of the sibling study is huge and is even consistent with an intelligence gain of 5 points from smoking during pregnancy! Again, they would have to calculate the ratio of the effect size, i.e., 0.70 / 0.36 (i.e., about 2), and its confidence interval. The authors are claiming this value is below 1 (i.e., attenuation) but their own best estimate is that the value is 2, i.e., suppression effect! If they had computed the confidence interval, they would have seen it is huge and uninformative despite their massive sample size.