Bayesian statistics for environmentalists

See also previously: Environmentalists like admixture analysis too (until they don’t)

Let’s talk a little Bayesian statistics and how it applies to model selection in the race and IQ debate. A number of environmentalists have made arguments like that by Centerwall 1978:

It is only 1.5 pages, so we quote in full:

On standard measures of IQ the average black usually scores approximately one standard deviation below the average white (e.g., Scarr et al., 1977). There is an ongoing controversy as to whether this is due primarily to environmental or genetic differences between the two populations. Scarr et al. (1977) state that “if genetic, racial differences do contribute to average intellectual differences between blacks and whites, then those blacks with higher degrees of white ancestry should perform better on intellectual tasks than those with lesser degrees of admixture.” They then design a study wherein the degree of white ancestry of black children, as determined by gene markers, is compared to their performance on intellectual tasks.

Would finding a positive correlation lend convincing support to notions of white genetic superiority? No, for the study design of Scarr et al. rests upon a fundamental, untestable assumption. They assume that, in terms of intellectual function, those whites who contributed to black ancestry were a random sample of all whites. A genetic interpretation of a positive correlation assumes that miscegenous whites had genetic IQs superior to blacks because they were a representative sample of the general white population–which, in turn, assumes that whites have superior genetic IQs. Thus, to conclude from a positive correlation that whites have superior genetic IQs, it is necessary to assume that whites have superior genetic IQs.

Suppose miscegenous whites were not a random collection of whites. If their IQs were similar to blacks–and lower than other whites–there should be no genetic correlation between degree of white ancestry and intellectual skills. Therefore, any positive correlation would be ipso facto attributable to environmental forces. Would this necessarily mean that miscegenous whites had less genetic mental endowment than other whites? Of course not. Environmental theory assumes that group differences in IQ are due to environmental forces. Thus, if it assumed that blacks have lower apparent IQs due to environmental forces, it can as well be assumed that miscegenous whites also had lower apparent IQs for the same reasons, e.g., social stigma, poor schooling, poverty.

Unfortunately, these arguments cut both ways. If no correlation is found between degree of white ancestry and intellectual skills (Scarr et al., 1977), it would be tempting to infer that there are no general black-white differences in genetic IQ. However, a lack of correlation only demonstrated that there was no significant difference in genetic IQ between ancestral blacks and ancestral miscegenous whites. To complete the syllogism, it is necessary to demonstrate that there was no significant difference in IQ between ancestral miscegenous whites and other whites, or, if there was, to demonstrate that the difference was due to environmental rather than genetic causes. Since most of the principals are dead and the historical data almost non-existent, neither demonstration is possible.

In attempting to resolve the genes-vs.-environment controversy, Scarr et al. have designed a study where any result can be explained by either hypothesis. From an ethical and social viewpoint, their findings were most fortunate. However, resting as they do on an untestable assumption, any inferences are scientifically invalid. The same will hold for any future studies of the same design.

His reasoning is only partly correct, and restating it shows why. The matings that produced African Americans (blacks) might have been selective; of course they were to some degree. If below-average-IQ whites mated with blacks, the European ancestry in African Americans will be below the average of that in pure whites (=100% European ancestry). If the selection was strongly negative, such that the whites who mated with blacks were 15 IQ points below the other whites in additive genetic terms (so, maybe 30 phenotypic!), then there will be no correlation between ancestry and IQ among blacks. If the selection was weaker than this, there will be a positive slope for European ancestry; if it was stronger, a negative slope. Thus, the expected slope depends on the strength of selection into race mixing. To reach the conclusion he reaches, one has to assume that this selection was extremely strong, about 30 phenotypic IQ points, and that it just so happens (ad hoc) to balance out the lower genotypic IQ of the African ancestry. OK, it could happen, but it is very unlikely. One can reason the same way for positive selection among whites, and also for selection among blacks. Thus, by playing around with the selection coefficients into mixed-race matings, one can produce any expectation for the slope of IQ on ancestry. Does this mean that estimating the real-world slope is uninformative? No, of course not.
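The arithmetic in the paragraph above can be written out as a minimal sketch. It assumes a purely additive toy model in which European segments carry the mean of the whites who admixed and African segments carry the mean implied by the hypothesised genetic gap. The function names, parameter names, and default values are hypothetical illustrations and are not taken from any of the cited studies.

```python
def expected_genotypic_iq(european_fraction, white_mean=100.0,
                          genetic_gap=15.0, white_selection=0.0):
    """Toy additive model (an assumption of this sketch, not a fitted model).

    european_fraction: share of European ancestry, from 0.0 to 1.0
    genetic_gap: hypothesised genotypic gap in IQ points
        (0 for purely environmental models)
    white_selection: how many genotypic IQ points the admixing whites
        were below the white mean (negative values mean above it)
    """
    european_segments = white_mean - white_selection
    african_segments = white_mean - genetic_gap
    return (european_fraction * european_segments
            + (1.0 - european_fraction) * african_segments)


def expected_slope(genetic_gap, white_selection):
    """Expected change in genotypic IQ per unit of European ancestry.

    Follows from the toy model above: the slope is the difference
    between the European-segment mean and the African-segment mean.
    """
    return genetic_gap - white_selection
```

Under these assumptions, `expected_slope(15, 15)` is zero, weaker selection gives a positive slope, and stronger selection gives a negative one, which is the pattern described in the text. The slope for African ancestry is the same number with the sign flipped.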

In a Bayesian fashion, we would imagine a set of models that differ in these parameters: the genetic IQ gap, black selective mating, and white selective mating. We would then derive each model's prediction for the slope of IQ on ancestry. If we plot the distribution of these predictions, we will find that the genetic models predict larger positive slopes for European ancestry (equivalently, more negative slopes for African ancestry) than the non-genetic models (i.e. ones with a genotypic IQ gap of 0). And if we take the best-guess models, the ones that use the most plausible assumptions about selective mating, we will find that they make quite different predictions about the ancestry slope. Then we can examine the real-life slopes. This has been done in our two published studies so far (Kirkegaard et al 2019, Lasker et al 2019), and both of them produced slope estimates in the strongly negative range for African ancestry (-1.58 in PING, -0.80 in PNC) that would be expected from any plausible genetic theory with a sizable genetic gap (e.g., >5 IQ). If we then think about model selection in a Bayesian way, we adjust upwards the posterior probability of the models that predicted values in that range. Very few models with low genetic gaps predict such values, because one has to make extreme assumptions about the selection of persons into race-mixed matings in the past. Thus, if one actually did the math in a simulation, and I guess we should since this reasoning is not obvious to many, one would find that the genetic models are indeed assigned relatively higher posteriors than the environmental models after we observe a large negative slope.
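The model comparison described above can be sketched as a small grid calculation. This is a toy illustration under loudly stated assumptions, not the simulation the text says should be done: the predicted slope is taken to be the genetic gap minus the white selection deficit (in IQ points per unit of European ancestry), the prior over gaps is uniform, the prior over selection is a normal curve centred on zero, and the observed slope is a hypothetical number rather than an estimate from any study. All names and values are made up for illustration.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def posterior_mass_by_gap(observed_slope, selection_sd, noise_sd=3.0,
                          gaps=(0, 5, 10, 15),
                          selections=tuple(range(-15, 16, 5))):
    """Posterior mass for each genetic-gap value on a discrete grid.

    Assumptions of this sketch: uniform prior over `gaps`, a normal
    prior (mean 0, sd `selection_sd`) over the selection deficit, and
    normal measurement noise (sd `noise_sd`) around the predicted slope.
    """
    weights = {}
    for gap in gaps:
        total = 0.0
        for sel in selections:
            predicted = gap - sel  # toy prediction for the ancestry slope
            prior = normal_pdf(sel, 0.0, selection_sd)
            likelihood = normal_pdf(observed_slope, predicted, noise_sd)
            total += prior * likelihood  # marginalise over selection
        weights[gap] = total
    norm = sum(weights.values())
    return {gap: w / norm for gap, w in weights.items()}


# Hypothetical observed slope of 10 IQ points per unit European ancestry.
tight = posterior_mass_by_gap(observed_slope=10.0, selection_sd=5.0)
loose = posterior_mass_by_gap(observed_slope=10.0, selection_sd=50.0)
```

With the tight prior on selection, the gap-of-10 model receives more posterior mass than the gap-of-0 model for this hypothetical slope. With the very wide prior the two are nearly indistinguishable, so in this sketch the result rests on how plausible extreme selection is taken to be.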

In the above, I ignored the possibility of a causal pathway from ancestry to skin color to discrimination to lower IQ. This has sometimes been advocated, and it also results in an ancestry slope similar to the one predicted by genetic models. This counter-model is called colorism, and it has been discussed a lot over the years by hereditarians. Of course, it also predicts that if we run a regression to predict IQ and include both ancestry and skin color, then ancestry should have little or no predictive validity, while skin color should. In fact, we found the exact opposite in the Lasker et al study that did this. One can also look at siblings: siblings don’t differ much in ancestry but still differ notably in skin color, so among siblings skin color is not strongly linked to ancestry. Colorism predicts that in this situation skin color predicts IQ, and genetic models predict that it doesn’t. Which is true? We already published a study on this too (Hu et al 2019), and we found no validity of skin color among siblings. This finding is in need of replication, which we are already looking into.

It should be noted that Centerwall is quite wrong to say that selection into race-mixed matings in the past is an “untestable assumption”. For a direct approach, one can examine census data to determine the level of selection. But if we want a genetic approach, we have that option as well. The prediction is specifically that the European ancestry found in blacks will have a below-average polygenic score compared to that found in pure whites. One can of course test this by estimating local ancestry among blacks the same way that 23andMe does, and then calculating separate polygenic scores for the African and European parts of the genome. This will supply us with an estimate of the selection into the ancestral matings that produced modern blacks. I think this would be highly informative to do, so it is on our to-do list for future studies.

Finally, is this some point that we had not already considered? Not even that. In the discussion section of Lasker et al, we wrote:

At least one researcher has advanced the possibility that admixture designs are hopelessly confounded due to historical assortative mating [112]. This theory supposes that the European ancestors of African-Americans were genetically disposed toward either lower or higher cognitive ability in a systematic way. Since Piffer [111] found that other African descent groups have similarly low eduPGS, this explanation would necessitate that the eduPGS are not indexing genetic differences related to cognitive ability in African groups. Determining if this is the case is important for future research. We can directly investigate this hypothesis and further test the transethnic validity of the eduPGS with local admixture mapping. If the theory implied here is correct, European ancestry segments should have systematically higher or lower eduPGS in African-American samples. Relatedly, if eduPGS lack transethnic validity in Africans, they will not be associated with specific regions of the brain, biological variables, or consistent patterns of gene expression as they are in Europeans.

The [112] reference is to Centerwall’s obscure paper. In fact, it only has 2 citations, one of which is ours! ETA: Scarr et al also replied to it back in 1979 but didn’t cite it, so the reply doesn’t show up on Google Scholar.
