The sibling control design

A friend of mine and his brother just received their 23andMe results.

In table form, they look like this (I have added my own results for comparison):

Macrorace	Bro1	Bro2	Emil
European	52.6	53	99.8
MENA	42.5	41.3	0.2
South Asian	2.8	3.4	0
East Asian & Amerindian	1.1	0.7	0
Sub-Saharan African	0.5	0.5	0
Oceanian	0	0	0
Unassigned	0.5	1.1	0.1

Sum	100	100	100.1


Mesorace	Bro1	Bro2	Emil
European
  Northern	51.5	51.5	91.3
  Southern	1	1.2	0
  Ashkenazi	0.1	0	2.9
  Eastern	0	0	4
  Common European	0.1	0.4	1.5
MENA
  Middle Eastern	42	40.8	0
  North African	0.3	0.2	0.2
  Common MENA	0.2	0.3	0
South Asian	2.8	3.4	0
East Asian & Amerindian
  East Asian	0.7	0.4	0
  Southeast Asian	0.2	0	0
  Amerindian	0	0.1	0
  Common East Asian & Amerindian	0.1	0.1	0
Sub-Saharan African
  East	0.3	0.3	0
  West	0.2	0.4	0
  Central & South	0	0	0
  Common Sub-Saharan African	0.1	0.1	0
Oceanian	0	0	0
Unassigned	0.5	1.1	0.1

Sum	100.1	100.3	100

Microrace	Bro1	Bro2	Emil
European
  Northern
    Scandinavian	21.3	24.2	37.3
    French & German	10.5	14.9	0.8
    British and Irish	8.9	4.9	11
    Finnish	0	0	0.3
    Common Northern	10.7	7.5	42
  Southern
    Italian	0.9	0.8	0
    Sardinian	0	0	0
    Iberian	0	0	0
    Balkan	0	0	0
    Common Southern	0.1	0.4	0
  Ashkenazi	0.1	0	2.9
  Eastern	0	0	4
  Common European	0.1	0.4	1.5
MENA
  Middle Eastern	42	40.8	0
  North African	0.3	0.2	0.2
  Common MENA	0.2	0.3	0
South Asian	2.8	3.4	0
East Asian & Amerindian
  East Asian
    Japanese	0.2	0	0
    Mongolian	0.1	0.2	0
    Korean	0	0	0
    Yakut	0	0	0
    Chinese	0	0	0
    Common East Asian	0.5	0.2	0
  Southeast Asian	0.2	0	0
  Amerindian	0	0.1	0
  Common East Asian & Amerindian	0.1	0.1	0
Sub-Saharan African
  East	0.3	0.3	0
  West	0.2	0.4	0
  Central & South	0	0	0
  Common Sub-Saharan African	0.1	0.1	0
Oceanian	0	0	0
Unassigned	0.5	1.1	0.1

Sum	100.1	100.3	100.1

Note that I have used data from all three zoom levels. Sometimes people ask the nonsensical question “How many races are there?” The answer depends on how far you zoom in. 23andMe supports three zoom levels, and I have called the groups identified at each level macro-, meso- and microraces.
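To make the zoom levels concrete, here is a minimal sketch of how a finer level rolls up into a coarser one. The numbers are Bro2's values copied from the tables above; the function name `zoom_out` is my own label for illustration, not anything 23andMe provides.

```python
# Bro2's Northern European microrace percentages, copied from the table above.
northern_micro = {
    "Scandinavian": 24.2,
    "French & German": 14.9,
    "British and Irish": 4.9,
    "Finnish": 0.0,
    "Common Northern": 7.5,
}

# Bro2's European mesorace percentages, copied from the table above.
european_meso = {
    "Northern": 51.5,
    "Southern": 1.2,
    "Ashkenazi": 0.0,
    "Eastern": 0.0,
    "Common European": 0.4,
}


def zoom_out(components):
    """Sum child-level percentages into their parent group, rounded to 0.1."""
    return round(sum(components.values()), 1)


northern_total = zoom_out(northern_micro)   # compare with the mesorace row "Northern"
european_total = zoom_out(european_meso)    # compare with the macrorace row "European"

print(northern_total, european_total)
```

The microrace values add up to the reported mesorace value for Northern (51.5). The mesorace values add up to 53.1, against a reported macrorace value of 53 for European. A gap of 0.1 is what one would expect if each level is rounded separately, which would also explain why some of the Sum rows come to slightly more than 100.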

So we see that the siblings' results are almost, but not exactly, the same. As Jason Malloy has pointed out, this is a very important fact because it allows for a sibling-control study akin to Murray (2002). In this design, researchers find full siblings, measure some predictor variable(s) in each sibling, and then test whether within-family differences in the predictor go along with differences in the outcome variable(s). The design is important because it removes the common-environment (between-family effects) confound that makes regression results difficult to interpret, e.g. those in The Bell Curve (Herrnstein and Murray, 1994). Murray (2002) used each sibling’s IQ to predict socioeconomic outcomes in adulthood (ages 30-38): income, marriage and birth out of wedlock. I reproduce the tables below:

The results are similar to those from the regression modeling presented in The Bell Curve. In other words, for this question, the effects were not due to the common-environment confound.
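The logic of the sibling-control design can be sketched with simulated data. Everything here is made up for illustration: the effect sizes, the sample size and the variable names are my assumptions, and this is not Murray's data or his exact model. The point is only that a family-level influence shared by both siblings biases an ordinary pooled regression, but cancels out when each sibling is compared with their own brother or sister.

```python
import random


def ols_slope(x, y):
    """Slope of a simple least-squares regression of y on x (with intercept)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


rng = random.Random(1)
TRUE_EFFECT = 0.5      # hypothetical effect of the predictor on the outcome
FAMILY_EFFECT = 1.0    # hypothetical shared-family influence on the outcome
N_FAMILIES = 5000      # hypothetical number of sibling pairs

pooled_x, pooled_y, diff_x, diff_y = [], [], [], []
for _ in range(N_FAMILIES):
    family = rng.gauss(0, 1)                  # shared by both siblings
    sibs = []
    for _ in range(2):
        x = family + rng.gauss(0, 1)          # predictor is correlated with family
        y = TRUE_EFFECT * x + FAMILY_EFFECT * family + rng.gauss(0, 1)
        sibs.append((x, y))
        pooled_x.append(x)
        pooled_y.append(y)
    # Within-pair differences: the shared family term drops out of both.
    diff_x.append(sibs[0][0] - sibs[1][0])
    diff_y.append(sibs[0][1] - sibs[1][1])

pooled_slope = ols_slope(pooled_x, pooled_y)   # confounded by family background
sibling_slope = ols_slope(diff_x, diff_y)      # uses only within-family variation

print(f"pooled: {pooled_slope:.2f}, within-sibling: {sibling_slope:.2f}")
```

In this setup the pooled slope overstates the effect, while the sibling-difference slope lands close to the value that was built into the simulation. If the two estimates agree in real data, as they roughly did for Murray, that suggests the pooled result was not driven by the shared family environment.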

The same design can be used to ask whether racial ancestry predicts outcome variables such as general cognitive ability (g factor, IQ, etc.), income, educational attainment and crime rate. Siblings differ somewhat in their ancestry (as was shown in the tables and figures above), so if the genetic hypothesis for a trait is true, these within-family differences in ancestry will slightly predict the level of the trait.
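To show how small the within-family differences are, here is a short sketch that takes the two brothers' macrorace values from the first table and computes the difference for each group. Only the five named ancestry groups are included; the dictionary layout is just my choice for illustration.

```python
# Macrorace percentages for the two brothers (Bro1, Bro2), from the first table.
macro = {
    "European": (52.6, 53.0),
    "MENA": (42.5, 41.3),
    "South Asian": (2.8, 3.4),
    "East Asian & Amerindian": (1.1, 0.7),
    "Sub-Saharan African": (0.5, 0.5),
}

# Difference Bro1 - Bro2 in percentage points, rounded to the table's precision.
diffs = {group: round(b1 - b2, 1) for group, (b1, b2) in macro.items()}

# The group on which the brothers differ most, in absolute terms.
largest = max(diffs, key=lambda group: abs(diffs[group]))

for group, d in diffs.items():
    print(f"{group}: {d:+.1f}")
print("largest difference:", largest)
```

The largest gap is in MENA ancestry, at 1.2 percentage points. In a sibling-control study, differences of this kind would be the predictor variable.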

In practice, for this to work, one will need a large sample of sibling sets (pairs, triples, etc.), because ancestry differences between full siblings are small. To keep things simple, the siblings should not be admixed from more than two genetic clusters/races. African Americans in the US are good for this purpose, as they are mostly a mix of European and African ancestry, but there are similar groups elsewhere in the world: Coloureds in South Africa, Greenlanders in Denmark and Greenland (Moltke et al., 2015), admixed Hawaiians, and basically everybody in South America (see admixture project, part I).
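A rough sense of what "large" means can be had from a standard power approximation for a regression on sibling differences. This is only a sketch under simplifying assumptions: sibling pairs only, independent and normally distributed residuals, and a two-sided test. The function name and all input values are hypothetical and are not estimates from any real dataset.

```python
from math import ceil, sqrt
from statistics import NormalDist


def pairs_needed(effect, sd_diff_x, sd_resid, alpha=0.05, power=0.80):
    """Approximate number of sibling pairs needed to detect `effect` in
    outcome difference = effect * predictor difference + noise.

    effect    : outcome change per unit of the predictor (hypothetical)
    sd_diff_x : SD of the sibling difference in the predictor
    sd_resid  : SD of each sibling's residual, assumed independent
    """
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    # The difference of two independent residuals has SD sqrt(2) * sd_resid.
    return ceil((z * sqrt(2) * sd_resid / (effect * sd_diff_x)) ** 2)


# Made-up inputs: ancestry measured as a proportion (0 to 1), outcome in SD units.
n_small_diff = pairs_needed(effect=1.0, sd_diff_x=0.01, sd_resid=1.0)
n_large_diff = pairs_needed(effect=1.0, sd_diff_x=0.02, sd_resid=1.0)

print(n_small_diff, n_large_diff)
```

The required number of pairs scales with the inverse square of the typical sibling difference, so doubling that difference cuts the requirement to about a quarter. This is one reason to prefer populations where siblings vary more in ancestry.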

References

Herrnstein, R. J., & Murray, C. A. (1994). The bell curve: Intelligence and class structure in American life (1st Free Press pbk. ed.). New York: Simon & Schuster.
Murray, C. (2002). IQ and income inequality in a sample of sibling pairs from advantaged family backgrounds. American Economic Review, 339–343.
Moltke, I., Fumagalli, M., Korneliussen, T. S., Crawford, J. E., Bjerregaard, P., Jørgensen, M. E., … others. (2015). Uncovering the Genetic History of the Present-Day Greenlandic Population. The American Journal of Human Genetics, 96(1), 54–69.
