Heritability, within and between groups

There is a new preprint that’s making the rounds on Twitter:

Without the ability to control or randomize environments (or genotypes), it is difficult to determine the degree to which observed phenotypic difference between two groups of individuals are due to genetic vs. environmental differences. However, some have suggested that these concerns may be limited to pathological cases, and methods have appeared that seem to give—directly or indirectly—some support to claims that aggregate heritable variation within groups can be related to heritable variation among groups. We consider three families of approaches: the “between-group heritability” sometimes invoked in behavior genetics, the statistic PST used in empirical work in evolutionary quantitative genetics, and methods based on variation in ancestry in an admixed population, used in anthropological and statistical genetics. In each case, we show mathematically that information on within-group genetic and phenotypic information in the aggregate cannot separate among-group differences into genetic and environmental components, and we provide simulation results that support our claims. We discuss these results in terms of the long-running debate on this topic.

It’s a kind of round-about reply to some of our recent work, none of which is cited. They do, however, cite the older Jensen and Rushton work, and they do cite Warne’s review paper and book. The chief aim is to throw doubt on the methods to estimate between group heritability and the metric itself. There is some sense to this because between group heritability is a weird metric. It can be larger than 100%. To see how, imagine that you have two populations that differ in their genetic potential for some phenotype. Given identical environments and the lack of gene-environment interactions, the phenotype gap will be entirely due to genetic factors, thus, 100% between group heritability. Consider a simple scenario. You breed two variants of a plant species for height. After a while, you have created a genetic difference of say 5 cm on average. Now if you put the tall plants in a worse environment for growth, they will be somewhat shorter than they would be given identical environments, say 2 cm shorter. Thus, the genetic difference is 5 cm, but the phenotypic difference is only 3 cm. So between group heritability would be more than 100% (5/3, or about 167%). Since humans tend to construct their own environments (active gene-environment correlation), some of their environmental differences can also be genetic in origin. If these are nonetheless equalized, the total genetic effect will be larger than the phenotypic difference. As such, the between group heritability will be greater than 100%, which is odd. This is arguably the case with regard to Africans living in Africa (poor environment) and elsewhere (better environments built by non-Africans). When Africans live in Africa, their IQ mean is about 70, but when they live in Western countries, it is about 80 IQ for pure Africans (100% African genetic ancestry). As such, the 10 IQ gap closing is due to environmental factors being relatively equalized.
However, the differences in these environmental causes are themselves caused by the genetics of Europeans, and as such, should be considered genetic effects as per active gene-environment correlation. In this case, depending on our terminology, we might say that the total genetic gap is 30 IQ (including the active gene-environment causation) or 20 IQ if we are interested only in the direct genetic effect. The math gets more quirky in situations where the average effects of genetics and environment go in different directions, for instance, where the phenotypic gap becomes 0, but the genetic gap is 5 cm, and the environmental one -5 cm. For reasons like this, this metric, though superficially attractive, is not terribly informative in practice, and can even be confusing. It is in fact rarely used. The main point in bringing it up occasionally is to show that Eric Turkheimer was forgetful or lying when he said that there isn’t any real quantitative theory of group differences (2017):

For all the hereditarians’ idle intuitions about differences being part genetic and part environmental, where is the empirical or quantitative theory that describes how this apportioning is supposed to work? There is no such thing as a “group heritability coefficient,” no way to put any meat on the speculative bones about partial genetic determination.
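
To make the arithmetic of the plant example above concrete, here is a minimal sketch. It assumes purely additive genetic and environmental gaps and uses the informal "genetic gap divided by phenotypic gap" reading from the plant example, not DeFries’ variance-based formula. The function name and the case labels are made up for illustration.

```python
def gap_ratio(genetic_gap_cm, environmental_gap_cm):
    """Genetic gap as a share of the observed phenotypic gap.

    Assumes a purely additive model: phenotypic = genetic + environmental.
    Returns None when the two gaps cancel, because the ratio is undefined.
    """
    phenotypic_gap_cm = genetic_gap_cm + environmental_gap_cm
    if phenotypic_gap_cm == 0:
        return None
    return genetic_gap_cm / phenotypic_gap_cm


# Hypothetical cases taken from the plant-breeding example in the text.
cases = {
    "identical environments": (5.0, 0.0),
    "tall line grown in worse environment": (5.0, -2.0),
    "genetic and environmental gaps cancel": (5.0, -5.0),
}

for label, (genetic, environmental) in cases.items():
    print(label, gap_ratio(genetic, environmental))
```

The three cases give a ratio of 1 (100%), 5/3 (above 100%), and no defined value at all, which is the quirkiness described above.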

The authors of the new paper carry out some simulations to show that the concepts are problematic. Based on their results, they want to move forward with their stronger claim, namely:

Here, we show that in contrast to DeFries’ suggestion, the rationale underlying Lewontin’s thought experiments is general and poses challenges for the interpretation of h2B [between group h²] whenever environmental differences among groups are not controlled or understood. In particular, we show that, in line with Lewontin’s intuition, the heritabilities of phenotypic differences between groups are not constrained by heritability within groups: perfect knowledge of within-group heritability provides no information about between-group heritability. Crucially, even if the heritability of between-group differences is estimated correctly, it leaves the direction of the genetic and environmental components of phenotypic difference unclear.

It is an example of the deductivist’s fallacy, which I named back in 2018:

Per this, I shall coin the fallacy made by Kempthorne, which I shall term the deductivist’s fallacy. It is when a critic looks at a relationship, finds that he cannot think of any strict formal i.e. necessary/logically necessary relationship between the premises and the conclusion, and then concludes that the premises can tell us nothing about the conclusion. This ignores the fact that the premises may have non-deductive/probabilistic relevance. To put it another way, the deductivist’s fallacy is when one criticizes an inductive argument for not being deductive enough. Many of the traditional fallacies in informal logic can be given a Bayesian reading this way and would no longer be fallacies. For instance, appeal to authority is surely relevant because statements about a topic from experts is more likely to be true than one from non-experts — unless it’s social science! There’s a few papers on this topic such as Korb 2004. Neven Sesardic also has a good discussion of the between/within group heritability reasoning in his excellent book Making sense of heritability.

The authors of the new piece reiterate the same thing but with more detail. Knowing only within group heritabilities and relying only on deductive reasoning, nothing can be inferred about the genetic causation of group gaps. That’s the same thing famous communist geneticist Richard Lewontin started promoting in the 1970s and which we still listen to (“the master argument”). They are no doubt right with the math. Jensen, of course, was aware of that, so he never said that one could make such a deductive inference. It’s the eternal strawman, that Sesardić spends a long time discussing in his 2005 book (Making Sense of Heritability, 10/10 read this book!). The hereditarian point is, however, that within group heritabilities, along with other information, are probabilistically or inductively informative about genetic causation of group gaps. All science is of course based on empirical results, and thus conclusions are always a matter of probability. Any empirical finding might be false since all of this rests on the mathematics of sampling. It’s possible, however unlikely, that a meta-analysis of 100 large non-p-hacked studies still produces an overall estimate far from the truth. This is not so interesting in itself, though. Taking seriously the inductive (and abductive) nature of science, there are a great many facts which together make it a “not unreasonable hypothesis” to posit genetic causation of race differences in intelligence and other phenotypes. That is what Jensen wrote in his 1969 paper and the conclusion has not changed in direction, but rather gotten stronger with time:

The fact that a reasonable hypothesis has not been rigorously proved does not mean that it should be summarily dismissed. It only means that we need more appropriate research for putting it to the test. I believe such definitive research is entirely possible but has not yet been done. So all we are left with are various lines of evidence, no one of which is definitive alone, but which, viewed all together, make it a not unreasonable hypothesis that genetic factors are strongly implicated in the average Negro-white intelligence difference. The preponderance of the evidence is, in my opinion, less consistent with a strictly environmental hypothesis than with a genetic hypothesis, which, of course, does not exclude the influence of environment or its interaction with genetic factors.

Contrary to their simulations, back in 2019, we carried out a simulation study to show that the between group heritability metric behaves as expected in simulated data. That is, when gaps are caused by genetics, the value is correctly found by application of DeFries’ formula, and also by admixture analysis. These simulations are not inconsistent with the new simulations by the authors, merely the same seen from a different perspective where one isn’t deliberately trying to make the metric look problematic.

Aside from their mathematical and simulation treatment of the between group heritability metric, the paper contains perhaps the first published criticism of admixture regression:

Under this formulation, it would be possible to use linear regression to estimate the slope (in terms of ancestry proportion θ) and intercept given a sample of admixed individuals with measured admixture fractions and phenotypes, but the resulting slope would provide information only about the sum of the genetic and environmental effects of global ancestry, not their relative contribution or even their direction.

A similar approach might be applied to admixed siblings, where the correlation with trait value is computed only with respect to variation in realized global ancestry between full siblings, who have the same expected global ancestry proportions on the autosomes. Limiting to within-sibship comparisons would remove some confounds of global ancestry proportion, but it would not exclude environmentally mediated phenotypic differences caused by variation in global ancestry. For example, in African Americans, colorism (Dixon & Telles, 2017) might lead to systematically different treatment as a function of ancestry fraction, even within sets of full siblings.

Really, though, their only point is that global ancestry itself can be correlated with non-genetic environmental factors, which cause the real difference in some phenotype. As such, a mere regression is not itself dispositive evidence of genetic causation. The main thing that they have to offer is speculation about colorism, that is, discrimination based on visual characteristics linked to race, chiefly skin coloration. Perhaps unknown to the authors, we of course already realized that genetic ancestry can be confounded by environmental factors such as these. In fact, in our meta-analysis of genetic ancestry and social status from 2017 we wrote:

Socioeconomic status (SES) inequalities between and within SIRE groups can lead to spurious associations of BGA [genetic ancestry] with medical outcomes when both are associated with SES (Marden et al., 2016). Thus, genetic traits associated with SES whose allele frequencies differ among ancestral groups can be misidentified as being associated with a specific medical outcome in admixture mapping. For this reason, controls for SES are not infrequently incorporated into such analyses. Narrative reports (e.g., González Burchard et al., 2005) have suggested that SES covaries with admixture such that, relative to European BGA, Amerindian and African BGA are associated with lower SES. If this is typically the case, it would be prudent for researchers to include reliable measures of SES as covariates in analyses to provide lower-bound estimates of the BGA-medical outcome associations. Moreover, it would be advisable to investigate the causal pathways mediating the BGA-SES associations, to identify possible unobserved non-genetic mediators of BGA-medical outcomes associations. However, no meta-analysis has been conducted to date to establish whether SES outcomes are associated with BGA in any consistent way.

For this reason, we have conducted a number of studies where we used measures of visual presentation — skin color, hair color, eye color — and checked whether these could explain the genetic ancestry findings. The answer is that visual presentation has zero detectable validity in explaining differences in intelligence once global genetic ancestry is taken into account. Here’s one example of such a model (Lasker et al 2019):

Thus, if we look at model 3a, we see that skin color itself has good predictive validity of intelligence (beta = -0.40, p < .001). However, when it is in a model along with genetic ancestry (model 3b), it has no validity at all (beta = -0.02, p > .10). We can also see in model 4 that even controlling for parental social status (SES), genetic ancestry continues to have a large effect (beta = 0.59, p < .001). Such a control for parental SES is, however, tricky to interpret. On a purely genetic model, since social status among the parents is itself mainly due to genetic factors, it is not a non-genetic cause of offspring intelligence. Rather, it represents a genetic effect measured indirectly, namely that holding parental SES constant, you are looking at relatively elite African Americans which are above average in their genetic potential for intelligence and SES. So one cannot interpret the decline in the slope for European ancestry when one controls for parental SES to necessarily mean that parental SES is a non-genetic environmental cause (confounder) of differences in intelligence. That would be the sociologist’s fallacy.

Another mistake by the authors above is to posit that sibling models are no more helpful than between family models. As a matter of fact, siblings have only a very small variation in their global ancestries, perhaps a few percentage points on average. These differences in global ancestry will be very weakly correlated with the visual presentation of the siblings because visual presentation differences between races are caused by a relatively small number of well-known genes. As such, a sibling comparison model would be much less confounded than an ordinary between-family model, also of course because siblings share their rearing environment in general, so that one doesn’t have to statistically control for this. Unfortunately, due to the very small variation in global ancestry between siblings, it would take a very large sample size of such admixed people to conduct a study with sufficient precision, perhaps upwards of 20k pairs of admixed Africans. This is perhaps possible to do if one pools data from all available studies of admixed Africans, including the Million Veteran Program. Due to the policing of access to large datasets, this analysis is unlikely to be carried out in the next few years, but will surely be done within the next 20 years. A better idea is to rely on half-siblings that were reared together, but have much greater variation in global ancestry between them. Due to the unstable nature of African Americans’ families, datasets usually have a quite large number of such half-siblings, who both grew up with the mother. I don’t know whether current datasets would be sufficient for this analysis, but it’s perhaps the case.

Taking a step back and thinking about the new paper as part of the running debate, I think it’s a rather predictable move. Egalitarians promote methods when they like the results, and attack the very same methods when they no longer produce results they want. For decades egalitarians have been promoting admixture analysis as a relatively straightforward way to settle the debate. Here’s Richard Nisbett in 2009 (my emphasis of most relevant parts in the long quote):

Racial Ancestry and IQ
All of the research reported above is most consistent with the proposition that the genetic contribution to the black/white difference is nil, but the evidence is not terribly probative one way or the other because it is indirect. The only direct evidence on the question of genetics concerns the racial ancestry of a given individual. The genes in the U.S. “black” population are about 20 percent European (Parra et al., 1998; Parra, Kittles, and Shriver, 2004). Some blacks have completely African ancestry, many have at least some European ancestry, and some—about 10 percent—have mostly European ancestry. Does it make a difference how African versus European a black person is? A hereditarian model demands that blacks with more European genes have higher IQs. Herrnstein and Murray (1994) and Rushton and Jensen (2005), as it happens, scarcely deal with this direct evidence.

[discussion of various studies]

So what do we have in the way of studies that examine the effects of racial ancestry—by far the most direct way to assess the contribution of genes versus the environment to the black/white IQ gap? We have one flawed adoption study with results consistent with the hypothesis that the gap is substantially genetic in origin, and we have two less-flawed adoption studies, one of which indicates slightly superior African genes and one of which suggests no genetic difference. We have dozens of studies looking at racial ancestry as indicated by skin color and “negroidness” of features that provide scant support for the genetic theory. In addition, three different studies of Europeanness of blood groups, using two different designs, indicate no support for the genetic theory. One study of illegitimate children in Germany demonstrates no superiority for children of white fathers as compared to children of black fathers. One study shows that exceptionally bright black children have no more European ancestry than the best-available estimate for the population as a whole. And one study indicates that it is more advantageous for a mixed-race child to be raised by a family having a white mother than by a family having a black mother. All of these racial ancestry studies are subject to alternative interpretations. Most of these alternatives boil down to the possibility that there was self-selection for IQ in black-white unions. If whites who mated with blacks had much lower IQs than whites in general, their European genes would convey little IQ advantage. Similarly, if blacks who mated with whites had much higher IQs than blacks in general, their African genes would not have been a drawback.
Yet the extent to which white genes contributing to mixed-race unions would have to be inferior to white genes in general, or black genes would have to be superior to black genes in general, would have to be very extreme to result in no IQ difference at all between children of purely African heritage and those of partially European origin. Moreover, self-selection by IQ was probably not very great during the slave era, when most black-white unions probably took place. It is unlikely, for example, that the white males who mated with black females had on average a lower IQ than other white males. Indeed, if such unions mostly involved white male slave-owners and black female slaves, which seems likely to be the case (Parra et al., 1998), and if economic status was slightly positively related to IQ (as it is now), these whites probably had IQs slightly above average. The black female partners were not likely chosen on the basis of IQ, as opposed to comeliness. Similarly, it scarcely seems likely that either black or white soldiers in World War II were selecting their German mates on the basis of IQ. Several studies, moreover, are immune to the self-selection hypothesis. In particular, the study involving black and white children raised in an institutional setting, and the study involving black children adopted into either black or white middle-class homes, could not be explained by self-selection for IQ in mating. In short, though one would never know it by reading Herrnstein and Murray’s book (1994) or Rushton and Jensen’s article (2005), the great mass of evidence on racial ancestry—the only direct evidence we have—points toward no contribution at all of genetics to the black/white gap.

The reason that Nisbett was happy about the method is that the old blood type studies found weak correlations between global ancestry and intelligence. In hindsight, this is not surprising because blood groups are only moderately useful proxies for genetic ancestry, and because within a given admixed group, the Pearson correlation between genetic ancestry and intelligence will be very weak (e.g. r = 0.09 in Lasker et al 2019), and this is true even if the entirety of the gap is caused by genetics. This is simply due to the range restriction in ancestry (i.e., variance reduction) and the large within population genetic variation in intelligence. As such, the weak, non-significant correlations found in relatively small samples based on blood group analysis are mostly uninformative about causes of group differences. The samples were simply too small to tell. People should have been reporting regression slopes, not correlations. The regression slope of a variable with a range of 0 to 100% genetic ancestry tells us the expected change in the phenotype given a 100% change in the ancestry.
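
The purely statistical point here, that restricting a predictor's range shrinks the Pearson correlation while leaving the regression slope roughly where it was, can be checked with a small generic simulation. This is only a sketch: `x` and `y` are abstract variables, and the slope, noise level, sample size and seed are arbitrary assumptions rather than estimates from any dataset. It illustrates attenuation of r only and says nothing about confounding.

```python
import random


def slope_and_r(xs, ys):
    """Ordinary least-squares slope and Pearson correlation of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx, sxy / (sxx * syy) ** 0.5


def simulate(x_sd, true_slope=1.0, noise_sd=1.0, n=50000, seed=1):
    """Draw x with standard deviation x_sd and y = true_slope * x + noise."""
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, x_sd) for _ in range(n)]
    ys = [true_slope * x + rng.gauss(0.0, noise_sd) for x in xs]
    return slope_and_r(xs, ys)


# Same data-generating slope, wide versus narrow spread in the predictor.
wide_slope, wide_r = simulate(x_sd=1.0)
narrow_slope, narrow_r = simulate(x_sd=0.1)

print(f"wide range:   slope={wide_slope:.2f}, r={wide_r:.2f}")
print(f"narrow range: slope={narrow_slope:.2f}, r={narrow_r:.2f}")
```

With the narrow spread, the correlation drops sharply even though the underlying slope is the same. The slope estimate also becomes much noisier, which is why small samples cannot settle the matter either way.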

Given the above, I expect that egalitarians will produce more papers attacking admixture regression as a method for studying genetic causation. This is despite the fact that this very same method is commonly used in genetic epidemiology, which deals with relatively uncontroversial differences such as hypertension or diabetes. I previously compiled a list of such mainstream research studies using the same method. What egalitarians never seem to do is to do any research they would consider actually informative about the causation of such gaps. Their output has a merchants of doubt feeling to it. It is not aimed at clarifying the relative importance of causes, rather it is made to create doubt about one particular cause, genetics, while not being rigorous about other favored causes (e.g. above, colorism is taken for granted despite being based on flimsy evidence). This is the same strategy that the tobacco industry followed when they wished to sow doubt on whether their products caused lung cancer. Of course, they followed this strategy for good capitalist reasons, namely that they wanted to avoid regulations of their products that would harm revenues. In the case of egalitarian research, the aim is to prevent unwelcome political policies that might be implemented if the public were to believe in genetic causation of group differences. Eric Turkheimer is very clear about this motivation (in 1990):

If it is ever documented conclusively, the genetic inferiority of a race on a trait as important as intelligence will rank with the atomic bomb as the most destructive scientific discovery in human history [emphasis always mine]. The correct conclusion is to withhold judgment.

Incidentally, Turkheimer had a wrong-headed approach to this. He considered the potential downsides of such widespread belief, but failed to consider the costs of belief in nonexistent environmental causes of such race differences. As such, his analysis is not really a true cost-benefit analysis, as it fails to consider the other potential outcomes. Jensen, again, was ahead of his time with a correct take on this matter. For more on that, see my prior post.