Apparently, only a few studies have examined this question and they are not easily available. Because we obtained these, it makes sense to share our results. The datafile is on Google Drive. We will fill in more results as we find them.

So far results reveal nothing surprising: MZ correlations are larger than DZ correlations. The h2 using Falconer’s formula is about 60%. Total sample ≈ 750 pairs.
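For readers unfamiliar with it, Falconer's formula estimates heritability as twice the difference between the MZ and DZ correlations. A minimal sketch in R follows. The two correlations are made-up illustrative values (they are not taken from the datafile), chosen only so that the result lands at the roughly 60% mentioned above; `falconer_h2` is a hypothetical helper name.

```r
# Falconer's formula: h2 = 2 * (r_MZ - r_DZ)
falconer_h2 = function(r_mz, r_dz) {
  2 * (r_mz - r_dz)
}

# hypothetical twin correlations, for illustration only
r_mz = 0.70
r_dz = 0.40

h2 = falconer_h2(r_mz, r_dz)
print(h2)
```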

It is common to talk about traits being monogenic or polygenic. We say that sickle-cell disease is monogenic because its heritable variation among humans can be accounted for by a single locus of genetic variation. Or more accurately, we say that 100% of the heritable variation can be accounted for by variation in that genetic locus (assuming the simple monogenic scenario, I did not look into the details). We say that height is strongly or highly polygenic because it seems like we need thousands of loci of genetic variation to account for the heritable variation among humans. The largest study that I know of, Wood et al 2014, identified 697 variants (with p < alpha) and these accounted for only about 20% of heritable variation in human height. Furthermore, as explored in a prior post, the distribution of effect sizes of variants follows a power law-like distribution.

Figure (cut_30_beta_N_log): the x-axis shows the beta of the SNP, the y-axis the log10 of the frequency.

Due to the way statistical power works, we tend to find the variants with the largest effects first. Making the assumption that we find the SNPs in the exact order of their effect size (only roughly true, empirically resolvable by those with access to data), we should be able to derive estimates of the number of SNPs needed to explain any given proportion of the heritable variation. Probably, we cannot estimate this with certainty if the proportion is set at 1 (100%), but we should be able to find decent estimates for e.g. 10%, 20% and 50%. Thus, we can introduce a specific measure of the degree of polygenicity: the number of variants needed to explain n% of the heritable variance. I propose the term n% genetic cardinality for this concept. This sets all the traits on the same scale and it should be calculable for any trait where there is a large GWAS.

An alternative measure would be to use the distribution parameters of the power law, but this would be more complicated to understand and estimate. The advantage is that it would be less noisy.

Can we estimate some numbers for a few traits just to showcase the concept? Perhaps. There are some traits where we have found SNPs that explain most of the variation. If we set n=50, then we can find the values in papers discussing these traits perhaps.

Liu et al 2010 studied eye color in the Dutch and found that a 17 SNP model predicted about 50% of variance in a cross-validation sample. Actually, a single SNP accounted for ~46% of the variance by itself.


This means that for this trait, if we set n to be anything ~45 or below, the number will be 1. Most traits are not like this. However, I was not able to find more easy examples like this. Most studies just report a bunch of p values, which are near useless for this purpose.

Liu et al 2015 report on skin color variation in Europeans and find that the top 9 SNPs explain 3-16% (depending on measure and sample).

[Figure: skin color, Liu et al 2015]

Bergholdt et al 2012 write that “The estimated proportion of heritability explained by currently identified loci is >80%”. However, I could not find support for this in the given reference (Wray et al 2010). I could not even find the number of SNPs they are talking about.

What is needed is for researchers to publish plots that show cumulative R2 (explained variance, preferably in an independent sample) by number of SNPs in order of largest effect sizes. This would allow for easy comparison of genetic cardinality of traits.


  • If one looked at a broader sample (racially heterogeneous), the genetic cardinality numbers would generally be smaller. It is easier to explain variation between white and black hair of Europeans and Africans than it is to explain smaller differences between white and brown hair among Europeans. Thus, the population must be kept roughly constant for the numbers to be comparable.
  • I don’t have time to try to figure out how to estimate the genetic cardinality from a published table of beta values. This should be possible if one is willing to make an assumption about the degree to which the effects are independent. I.e., one would first sort the SNPs by beta, then calculate the R2 values, then calculate the cumulative sum of R2s. Then fit the power law distribution to the number of SNPs used and the cumulative R2. This uses the assumption of no overlap in R2. The degree of overlap can be determined if one had access to case-level data, altho overlap in itself would be a function of the number of SNPs. Complicated.
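The sort-and-cumulate part of the procedure in the second point can be sketched roughly as below. This is only an illustration under stated assumptions: a standardized phenotype, additive SNP effects with no overlap in R2, and a per-SNP R2 of 2p(1 - p)beta^2, where p is the allele frequency. It sorts by R2 rather than by raw beta, because allele frequency also matters for explained variance, and it skips the power law fitting step. The function name `genetic_cardinality`, the toy betas, the frequencies and the h2 value are all made up.

```r
# Sketch: n% genetic cardinality from a table of SNP betas.
# Assumptions: standardized phenotype, additive and independent SNP effects,
# so the per-SNP R2 is 2 * p * (1 - p) * beta^2.
genetic_cardinality = function(beta, maf, h2, n_pct) {
  r2 = 2 * maf * (1 - maf) * beta^2          # variance explained per SNP
  r2 = sort(r2, decreasing = TRUE)           # largest contributions first
  cum_share = cumsum(r2) / h2                # cumulative share of heritable variance
  idx = which(cum_share >= n_pct / 100)
  if (length(idx) == 0) return(NA_integer_)  # the listed SNPs never reach n%
  idx[1]
}

# toy example with made-up values
beta = c(0.50, 0.10, 0.05, 0.05, 0.02)
maf  = c(0.50, 0.30, 0.20, 0.40, 0.10)
h2   = 0.5

print(genetic_cardinality(beta, maf, h2, n_pct = 20))
print(genetic_cardinality(beta, maf, h2, n_pct = 90))
```

With these toy numbers the first SNP alone covers a quarter of the heritable variance, so the 20% cardinality is 1, while the five SNPs together never reach 90%, which the function reports as NA.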

This may have some interest. Basically, typologists cannot into statistics and it shows. On the other hand, it means there is a large number of low hanging fruit for someone with skills in statistical programming.


This was handed in as a paper for typology class. Quite likely the last class I will take in linguistics. I don’t plan on actually getting the master’s degree.

Consider the model below:

General model for immigrant group traits and outcomes

Something much like this has been my intuitive working model for thinking about immigrant groups’ traits and socioeconomic outcomes. I will explain the model in this post and refer back to it or use the material in some upcoming paper (nothing planned).

The model shows the home country/a country of origin and two destination countries. The model is not limited to just two destination countries, but I did not draw more to avoid making the model larger. It can be worth using more in some cases which will be explained below.

Familial traits (or intergenerational) are those traits that run in families. This term includes both genetic and shared environmental effects. Because most children grow up with their parents (I assume), it does not matter whether the parents’ traits→children’s traits route is genetic or environmental. This means that both psychological traits (mostly genetic) and cultural traits (mostly shared environmental) such as specific religion are included.

When persons leave (emigrate) their home country, there is some selection: people who decide to leave are not random. Sometimes, it is not easy to leave because the government actively tries to restrict its citizens from leaving. This is shown in the model as the Emigration selection→Emigrant group familial traits link. Emigration selection seems to be mostly positive in the real world: the better off and smarter emigrate more than the poorer and less bright.

When the immigrants then move to other countries, there is Immigration selection because the destination countries usually don’t just allow whoever to move in if they want to. Immigration selection can have both positive and negative effects. Countries that receive refugees but try not to receive others have negative selection, while those that try to only pick the best potential immigrants have positive selection. Often countries have elements of both. Immigration selection and Emigrant group familial traits jointly lead to Immigrant group familial traits in a particular destination country.

Note that immigration selection is unique to each destination country, but can be similar for some countries. This would show up as correlated immigration selection scores. There is also immigration selection that doesn’t happen in the destination country, namely selection that happens due to geographical distance. For this reason I placed the Immigration selection node half in the destination country boxes. With a more complex model, one could split these if desired.

Worse, it is possible that immigration selection in a given country depends on the origin country, i.e. a country-country interaction selection. This wasn’t included in the above model. Examples of this are easy to find. For instance, within the EU (well, it’s complicated), there is relatively free movement of EU citizens, but not so for persons coming in from outside the EU.

Socioeconomic outcomes: Human capital model + luck

The S factor score of the home country (the general factor of socioeconomic outcomes, which one can think of as roughly equal to the Human Development Index, just broader) is modeled as being the outcome of the Population familial traits and Environmental and historical luck. I think it is mostly the former. Perhaps the most obvious example of environmental luck is having valuable natural resources within your borders, today especially oil. But note that even this is somewhat complicated because borders can change by use of ‘bigger army diplomacy’ or by simply purchasing more land, so one could strategically buy or otherwise acquire land that has valuable resources on it, making it not a strict environmental effect.

Other things could be having access to water, sunlight, wind, earthquakes, mountains, large bodies of inland water & rivers, active underground, arable land, living close to peaceful (or not so much) neighbors and so on. These things can promote or retard economic development. Having suitable rivers means that one can get cheap and safe (well, mostly) energy from those. Countries without such resources have to look elsewhere which may cost more. They are not always strictly environmental, but some amount of their variance is more or less randomly distributed to countries. Some are more lucky than others.

There are some who argue that countries that were colonized are better off now because of it, so that would count as historical luck. However, being colonized is not just an environmental effect because it means that foreign powers were able to defeat your forces overwhelmingly for decades. If they were able to, you probably had a poor military which is linked to general technological development. There is some environmental component to whether you have a history of communism, but it seems to still have negative effects on economic growth decades after.

For immigrant groups inside a host country, however, the environmental effects with country-wide effects cannot account for differences. These are thus due to familial effects only (by a good approximation). To be sure, the other people living in the destination/host country, Other group familial traits, probably have some effect on the Immigrant familial traits as well, such as religion and language. These familial traits and the Other group S then jointly cause the Immigrant group S. Open Borders advocates often talk about one aspect of this effect:

Wage differences are a revealing metric of border discrimination. When a worker from a poorer country moves to a richer one, her wages might double, triple, or rise even tenfold. These extreme wage differences reflect restrictions as stifling as the laws that separated white and black South Africans at the height of Apartheid. Geographical differences in wages also signal opportunity—for financially empowering the migrants, of course, but also for increasing total world output. On the other side of discrimination lies untapped potential. Economists have estimated that a world of open borders would double world GDP.

Paths estimated in studies

A path model is always complete which means that all causal routes are explicitly specified. All the remaining links are non-causal, but nodes can be substantially correlated. For instance, there is no link between the home country Country S and immigrant group S but these are strongly correlated in practice. I previously reported correlations between home Country S and Immigrant group S of .54 and .72 for Denmark and Norway.

There is no link between home country Population familial traits and Immigrant group familial traits, but there is only one link in between (Emigrant group familial traits), so it seems reasonable to try to correlate these two nodes. A few studies have looked at these types of correlations. For instance, John Fuerst has looked at GRE/GMAT scores and the like for immigrant groups in the US. This is taken as a proxy for cognitive ability, probably the most important component of the psychological traits part of familial traits. In that paper, Fuerst found correlations of .78 and .81 between these and country cognitive ability using Lynn and Vanhanen’s dataset.

Rindermann and Thompson have reported correlations between cognitive ability (component of Immigrant group familial traits) and native population cognitive ability (component of Other group familial traits).

Most of my studies have looked at the nodes Population familial traits (sub-components Islam belief and cognitive ability) and Immigrant group S (or sub-components like crime if S was not available). Often this results in large correlations: .54 and .59 for Denmark and Norway (depending on how to deal with missing data, use of weighted correlations etc.). Note that in the model the first does cause the second, but there are a few intermediate steps and other variables, especially Emigrant selection (differs by country of origin which reduces the correlation) and Immigrant selection (which has no effect on the correlation).

There is much to be done. If one could obtain estimates of multiple nodes in a causal chain, one could use mediation analysis to see if mediation is plausible. E.g. right now we have Immigrant group S for two countries and cognitive ability for 100s of countries of origin, so if we could obtain immigrant group cognitive ability, one could test the mediation role of the last. With the current data, one can also check whether country of origin cognitive ability mediates the relationship between immigrant group S and country of origin S, which it should partly, according to the model. I say partly because the mediation is only to the extent that familial cognitive ability is a cause.
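To make the mediation idea concrete, here is a minimal sketch on simulated data using only base R. Everything in it is an assumption for illustration: the variable names, the path strengths (0.7 and 0.6), the number of groups, and the fact that mediation is complete by construction. It uses the simple regression comparison (total effect vs. direct effect with the mediator included), not a full path model.

```r
set.seed(1)
n_groups = 500  # hypothetical; real datasets have far fewer origin groups

# simulated causal chain: origin ability -> immigrant group ability -> immigrant group S
origin_ca    = rnorm(n_groups)
immigrant_ca = 0.7 * origin_ca + rnorm(n_groups, sd = 0.5)     # mediator
immigrant_S  = 0.6 * immigrant_ca + rnorm(n_groups, sd = 0.5)  # outcome

total  = lm(immigrant_S ~ origin_ca)                 # total effect of origin ability
direct = lm(immigrant_S ~ origin_ca + immigrant_ca)  # direct effect, mediator held constant

print(coef(total)["origin_ca"])   # close to 0.7 * 0.6 = 0.42 by construction
print(coef(direct)["origin_ca"])  # close to 0, since mediation is complete here
```

If mediation were only partial, as the model above implies for real data, the second coefficient would shrink relative to the first without going to zero.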


In a recent post, Steve Hsu writes:

It’s a shame that we don’t have a better online platform (e.g., like Quora or StackOverflow) for discussing scientific papers. This would allow the authors of a paper to communicate directly with interested readers, immediately after the paper appears.

The best current service I can think of is Reddit. Reddit has tons of subreddits, some of which are concerned with scientific papers. So, for instance, a paper might be posted on r/psychology and get discussion there. However, there is no unified system, so the same paper might get posted several other places independently (this paper has been posted to 5 subreddits as of writing this post). The authors of the paper may not know of the discussion and so will not answer any questions people might have, such as requests for additional analyses, which are especially relevant when data are not public (most of the time) or analysis is difficult.

One could automatically generate a thread for every published paper. I think this is a bad idea because most papers will get no interest at all (or citations for that matter) and it would necessitate an enormous database. However, one idea is to generate a thread for a paper as soon as a user expresses desire to discuss that paper. This can be done automatically by information extracted from the journal website (metadata, see e.g. Google Scholar’s recommendations), indirectly via the DOI or manually entered (this is however a problem because it opens the door to spammers). Upon generation of the thread and some minimum level of activity in it, one could notify the authors of the paper automatically (using their contact emails in the paper) that someone is discussing it on the site.

The good thing about the internet is that everybody can contribute which is also the bad thing. This means that good contributions can come from anywhere, and lots of useless or counterproductive contributions will also come. The goal is to read the good contributions and to not read the bad contributions. The general solution to this problem is filtering. This can be done both as part of the system (like Reddit’s up and down vote-system), or using client-side scripts. Preferably, the system itself should have multiple ways of filtering the content so that users can pick a filter that gives an output fairly close to their desired output. Reddit has a few options: hot, new, controversial, top, but one could easily add more.

How feasible is such a proposed system? Very feasible, probably even I could set it up if given a bit of time. In practical terms, what would it cost to set it up? One would need to hire some programmers with the relevant expertise (not so cheap) and buy some cloud hosting (fairly cheap).

This was an exchange between researchers that took place in 2006 in the academic journal Intelligence (34).

  1. Templer, D. I., & Arikawa, H. (2006). Temperature, skin color, per capita income, and IQ: An international perspective. Intelligence, 34(2), 121-139.
  2. Jensen, A. R. (2006). Comments on correlations of IQ with skin color and geographic–demographic variables. Intelligence, 34(2), 128-131.
  3. Hunt, E., & Sternberg, R. J. (2006). Sorry, wrong numbers: An analysis of a study of a correlation between skin color and IQ. Intelligence, 34(2), 131-137.
  4. Templer, D. I., & Arikawa, H. (2006). The Jensen and the Hunt and Sternberg comments: From penetrating to absurd. Intelligence, 34(2), 137-139.

More curious readers can read later works on the topic, some of which are listed below (in no particular order). The list includes both proponents and critics:

  • Hunt, E., & Carlson, J. (2007). Considerations relating to the study of group differences in intelligence. Perspectives on Psychological Science, 2(2), 194-213.
  • Templer, D. I. (2008). Correlational and factor analytic support for Rushton’s differential K life history theory. Personality and Individual Differences, 45(6), 440-444.
  • Rushton, J. P., & Templer, D. I. (2009). National differences in intelligence, crime, income, and skin color. Intelligence, 37(4), 341-346.
  • Pesta, B. J., & Poznanski, P. J. (2014). Only in America: Cold Winters Theory, race, IQ and well-being. Intelligence, 46, 271-274.
  • Lynn, R. (2006). Race differences in intelligence: An evolutionary analysis. Washington Summit Publishers.
  • Eppig, C., Fincher, C. L., & Thornhill, R. (2010). Parasite prevalence and the worldwide distribution of cognitive ability. Proceedings of the Royal Society of London B: Biological Sciences, 277(1701), 3801-3808.
  • Kanazawa, S. (2008). Temperature and evolutionary novelty as forces behind the evolution of general intelligence. Intelligence, 36(2), 99-108.
  • Wicherts, J. M., Borsboom, D., & Dolan, C. V. (2010). Evolution, brain size, and the national IQ of peoples around 3000 years BC. Personality and Individual Differences, 48(2), 104-106.
  • Templer, D. I., & Stephens, J. S. (2014). The relationship between IQ and climatic variables in African and Eurasian countries. Intelligence, 46, 169-178.
  • Lynn, R., & Vanhanen, T. (2012). Intelligence: A unifying construct for the social sciences. Ulster Institute for Social Research.

Chisala has his 3rd installment up:

One idea I had while reading it was that tail effects interact with population ethnic/racial heterogeneity. To show this, I did a simulation experiment. Population 1 is a regular population with a mean of 0 and sd of 1. Population 2 is a composite population of three sub-populations: one with a mean of 0 (80%; “normals”), one with a mean of -1 (10%; “dullards”) and one with a mean of 1 (10%; “brights”). Population 3 is a normal population but with a slightly increased sd so that it is equal to the sd of population 2.

Descriptive stats:

> describe(df, skew = F, ranges = T)
     vars     n mean  sd median trimmed  mad   min  max range se
pop1    1 1e+06    0 1.0      0       0 1.00 -4.88 4.65  9.53  0
pop2    2 1e+06    0 1.1      0       0 1.09 -5.43 5.37 10.80  0
pop3    3 1e+06    0 1.1      0       0 1.09 -5.30 5.13 10.44  0

We see that the sd is increased a bit in the composite population (2) as expected. We also see that the range is somewhat increased, even compared to population 3 which has the same sd.

What do the tails look like?

> sapply(df, percent_cutoff, cutoff = 1:4)
      pop1     pop2     pop3
1 0.158830 0.179495 0.180856
2 0.022903 0.034342 0.034074
3 0.001314 0.003326 0.003126
4 0.000036 0.000160 0.000150

We are looking at the proportions of persons with scores above 1-4 (rows) by each population (cols). What do we see? Populations 2 and 3 have clear advantages over population 1, and population 2 has a slight advantage over population 3 at cutoffs 2 and above (at cutoff 1, population 3 is slightly ahead).

Simulation 2

In the above, the composite population is made out of 3 populations. But what if it were instead made out of 5?


> describe(df, skew = F)
     vars     n mean   sd median trimmed  mad   min  max range se
pop1    1 1e+06    0 1.00      0       0 1.00 -4.88 4.65  9.53  0
pop2    2 1e+06    0 1.27      0       0 1.21 -5.91 6.03 11.94  0
pop3    3 1e+06    0 1.27      0       0 1.26 -6.12 5.92 12.04  0

The sd is clearly increased. There is not much difference in the range between populations 2 and 3, but the range is very susceptible to sampling error, which is present here. What do the tails look like?

> sapply(df, percent_cutoff, cutoff = 1:4)
      pop1     pop2     pop3
1 0.158830 0.205814 0.214353
2 0.022903 0.057077 0.056874
3 0.001314 0.011057 0.008872
4 0.000036 0.001246 0.000804

We see strong effects. At the +3 level, there are roughly 8x as many persons in the composite population as in the normal population (0.0111 vs. 0.0013). Population 3 also has more, but clearly fewer than the composite population.

We can conclude that one must take heterogeneity of populations into account when thinking about the tails.

R code

You can re-do the experiment yourself with this code, or try out some other numbers.

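library(pacman) # provides p_load()
# describe() is from psych; percent_cutoff() is from kirkegaard
p_load(reshape, kirkegaard, psych)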

n = 1e6

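# first simulation --------------------------------------------------------
pop1 = rnorm(n) # homogeneous reference population: mean 0, sd 1
pop2 = c(rnorm(n*.8), rnorm(n*.1, 1), rnorm(n*.1, -1)) # composite: 80% mean 0, 10% mean 1, 10% mean -1
pop3 = rnorm(n, sd = sd(pop2)) # homogeneous, with sd matched to the composite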

df = data.frame(pop1, pop2, pop3)

describe(df, skew = F)
sapply(df, percent_cutoff, cutoff = 1:4)

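# second simulation -------------------------------------------------------
pop1 = rnorm(n) # homogeneous reference population: mean 0, sd 1
pop2 = c(rnorm(n*.70), rnorm(n*.10, 1), rnorm(n*.10, -1), rnorm(n*.05, 2), rnorm(n*.05, -2)) # composite of 5 sub-populations
pop3 = rnorm(n, sd = sd(pop2)) # homogeneous, with sd matched to the composite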

df = data.frame(pop1, pop2, pop3)

describe(df, skew = F)
sapply(df, percent_cutoff, cutoff = 1:4)
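
Because the composite populations are mixtures of normal distributions, the simulated tail proportions can also be checked against exact values: the proportion above a cutoff is the weighted sum of the sub-population tails. The sketch below does this for the second simulation using base R only. The helper names `mixture_tail` and `mixture_sd` are my own, and the code assumes every sub-population has sd 1, as in the simulations above.

```r
# Exact tail proportions for a mixture of normals with within-group sd = 1.
mixture_tail = function(cutoff, weights, means) {
  sapply(cutoff, function(x) sum(weights * pnorm(x, mean = means, lower.tail = FALSE)))
}

# sd of the mixture: within-group variance (1) plus the variance of the group means
mixture_sd = function(weights, means) {
  sqrt(1 + sum(weights * means^2) - sum(weights * means)^2)
}

# second simulation: five sub-populations
w = c(.70, .10, .10, .05, .05)
m = c(0, 1, -1, 2, -2)

tails = data.frame(
  cutoff = 1:4,
  pop1 = pnorm(1:4, lower.tail = FALSE),
  pop2 = mixture_tail(1:4, w, m),
  pop3 = pnorm(1:4, sd = mixture_sd(w, m), lower.tail = FALSE)
)
print(mixture_sd(w, m))
print(tails)
```

The exact values should sit close to the simulated ones, with the remaining differences attributable to sampling error.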

Van Ijzendoorn, M. H., Juffer, F., & Poelhuis, C. W. K. (2005). Adoption and cognitive development: a meta-analytic comparison of adopted and nonadopted children’s IQ and school performance. Psychological Bulletin, 131(2), 301.

It turns out that someone already did a meta-analysis of adoption studies and cognitive ability. It does not solely include cross-country transracial, but it does include some. They report both country of origin and country of adoption, so it is fairly easy to find the studies that one wants to take a closer look at. It is fairly inclusive in what counts as cognitive development, e.g. school results and language tests count, as well as regular IQ tests. They report standardized differences (d), so results are easy to understand.

They do not present aggregated results by country of origin however, so one would have to do that oneself. I haven’t done it (yet?), but the method to do so is this:

  1. Obtain the country IQs for all countries in the study. These are readily available from Lynn & Vanhanen (2012) or in the international megadataset.
  2. Score all the outcomes by using the adoptive country’s IQ. E.g. if the US has a score of 97, and Koreans adopted to that country get a d score of .16 in “School results”, as they do in the first study listed, then this corresponds to a school IQ performance of 97 - 2.4 = 94.6 (.16 × 15 = 2.4 IQ points). Note that this assumes that the comparison sample is unselected (not smarter than average). This is likely false because adoptive parents tend to be higher social class and presumably smarter, so they would send their (adoptive) children to above average schools. Also be careful about norm comparisons because they often use older norms and the Flynn effect thus results in higher IQ scores for the adoptees.
  3. Copy relevant study characteristics from the table, e.g. comparison group, sample sizes, age of assessment and type of outcome (school, language, IQ, etc.).
  4. Repeat step (2-3) for all studies.
  5. BONUS: Look for additional studies. Do this by, a) contacting the authors of recent papers and the meta-analysis, b) search for more using Google Scholar/other academic search engine, c) look thru the studies that cite the included studies for more relevant studies.
  6. BONUS: Get someone else to independently repeat steps (2-3) for the same studies. This checks interrater consistency.
  7. Aggregate results (weighted mean of various kinds).
  8. Correlate aggregated results with origin countries’ IQs to check for spatial transferability, a prediction of genetic models.
  9. Do regression analyses to see if study characteristics predict outcomes.
  10. Write it up and submit to Open Differential Psychology (good guy) or Intelligence (career-focused bad guy). Write up to Winnower or Human Varieties if lazy or too busy.
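
Steps 2 and 7 above can be sketched in a few lines of base R. This is only an illustration: `d_to_iq` is a hypothetical helper, it assumes an IQ sd of 15 and that a positive d means the adoptees score below the comparison group (as in the worked example in step 2), and the rows in `studies` are made-up placeholders rather than values coded from the table.

```r
# Step 2: convert a standardized difference (d) into an IQ-scale score.
# Assumes sd = 15 and that positive d means adoptees score below the comparison group.
d_to_iq = function(d, host_iq, sd = 15) {
  host_iq - d * sd
}

print(d_to_iq(0.16, host_iq = 97))  # the worked example: 97 - 2.4 = 94.6

# made-up coded rows, for illustration only
studies = data.frame(
  origin  = c("OriginA", "OriginA", "OriginB"),
  host_iq = c(97, 100, 100),
  d       = c(0.16, 0.00, 0.30),
  n       = c(135, 240, 50)
)
studies$iq = d_to_iq(studies$d, studies$host_iq)

# Step 7: sample-size weighted mean per country of origin
agg = sapply(split(studies, studies$origin), function(s) weighted.mean(s$iq, s$n))
print(agg)
```

Sample-size weighting is just one of the possible weighting schemes mentioned in step 7; the resulting per-origin means are what one would then correlate with origin country IQs in step 8.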

The main results table

More likely, you are too lazy to do the above, but you want a sneak peek at the results. Here’s the main table from the paper.

Study Country/region of study Country/region of child’s origin Age at assessment (years) Age at adoption (months) N Adoption N Comparison Preadoption status Comparison group Outcome (d)
Andresen (1992) Norway Korea 12-18 12-24 135 135 Not reported Classmates School results 0.16 Language 0.09
Benson et al. (1994) United States United States 12-18 < 15 881 Norm Not reported Norm group School results —0.36
Berg-Kelly & Eriksson (1997) Sweden Korea/India 12-18 < 12 125 9204 Not reported General population School results 0.03 f/—0.04 m Language —0.02 f/—0.05 m
Bohman (1970) Sweden Not reported 12-18 < 12 160 1819 Not reported Classmates School results 0.09 f/0.07 m Language 0.02 f/—0.02 m Learning problems 0.00
Brodzinsky et al. (1984) United States United States 4-12 < 12 130 130 Not reported General population School competence 0.62 f/0.51 m
Brodzinsky & Steiger (1991) United States Not reported 9-19 441 6753 Not reported Population % School failure 0.76
Bunjes & de Vries (1988) Netherlands Korea
4-12 12-24 118 236 Not reported Classmates School results 0.24 Language 0.22
Castle et al. (2000) England England 4-12 < 12 52 Norm Not reported Standardized scores School results —0.47, IQ 0.47
Clark & Hanisee (1982) United States Vietnam
0-4 12-24 25 Norm Not reported Standardized scores IQ -2.42
Colombo et al. (1992) Chile Chile 4-12 0-12 16 ii Undernutrition Biological siblings IQ -1.16
Cook et al. (1997) Europe Not reported 4-8 12-24 131 125 Not reported General population School competence 0.56 f/0.16 m
Dalen (2001) Norway Korea
12-18 0-12 193 193 Not reported Classmates School results 0.47 (Colombia), —0.07 (Korea)
Language 0.43 (Colombia), —0.05 (Korea)
Learning problems 0.50
Dennis (1973) United States Lebanon 2-18 > 24 85 51 Institute Institute children IQ —1.28 (intraracial), —1.36 (transracial)
De Jong (2001) New Zealand Romania/Russia 4-15 12-24 116 Norm Some problems General population School competence 0.65
Duyme (1988) France France 12-18 < 12 87 14951 Not reported General population School results 0.00
Fan et al. (2002) United States United States 12-18 514 17241 Not reported General population School grades —0.02
Feigelman (1997) United States Not reported 8-21 101 6258 Not reported General population Education level —0.03
Fisch et al. (1976) United States United States 4-12 < 12 94 188 No problems General population IQ 0.00
School results 0.50 Language 0.52
Frydman & Lynn
Belgium Korea 4-12 12-24 19 Norm Not reported Standardized scores IQ -1.68
Gardner et al. (1961) United States Not reported 12-18 < 12 29 29 Not reported Classmates School achievement 0.09
Geerars et al. (1995) Netherlands Thailand 12-18 < 12 68 Norm Not reported Population % School results 0.19
Hoopes et al. (1970) United States United States 12-18 100 100 1-2 shifts in placement General population IQ 0.12
Hoopes (1982) United States United States 4-12 < 12 260 68 Nothing special General population IQ 0.18
Horn et al. (1979) United States United States 3-26 < 1 469 164 No problems Environment siblings IQ 0.17/0.34/—0.05
W. J. Kim et al. (1992) United States Not reported 12-18 43 43 Not reported General population School results 0.74
W. J. Kim et al. (1999) United States Korea 4-12 < 12 18 9 Nothing special Environment siblings School competence —0.39
Lansford et al. (2001) United States Not reported 12-18 111 200 Not reported General population School grades 0.46
Leahy (1935) United States United States 5-14 < 6 194 194 Not reported General population School grades 0.00 IQ -0.06
Levy-Shiff et al. (1997) Israel Israel
South America
7-13 < 3 5050 Norm
Not reported Standardized scores IQ -1.10 f/—2.00 m
Lien et al. (1977) United States Korea 12-18 > 24 240 Norm Undernutrition Standardized scores IQ 0.00
Lipman et al. (1992) Canada Not reported 4-16 104 3185 Not reported General population School performance —0.05 f/0.16 m
McGuinness & Pallansch (2000) United States Soviet Union 4-12 > 24 105 1000 Long time in orphanages Norm group School competence 0.46
Moore (1986) United States United States 7-10 12-24 23 Norm Not reported Standardized scores IQ -0.00 f/—1.00 m
Morison & Ell wood (2000) Canada Romania 4-12 12-24 59 35 Orphanages General population IQ 1.45 (combined)
Neiss & Rowe (2000) United States 75% LTnited States 12-18 392 392 Not reported General population IQ 0.08
O’Connor et al. (2000) England Romania 6 0-42 207 Norm Orphanage Standardized scores IQ —0.56 (combined)
Palacios & Sanchez (1996) Spain Spain 4-12 > 24 210 314 Not reported Institute children School competence —0.18
Pinderhughes (1998) United States United States 8-15 24—48 66 33 Older children General population School competence 0.64 (combined)
Plomin & DeEries (1985) United States United States 1 0-5 182 182 Not reported General population IQ 0.14
Priel et al. (2000) Israel 75% Israel 8-12 12-24 50 80 Not reported General population School competence 0.77 f/1.12 m
Rosenwald (1995) Australia 73% Korea Asia
South America
4-16 < 12 283 2583 Not reported General population School performance —0.18
Scarr & Weinberg (1976) United States 88% LTnited States 4-16 <12 176 145 Not reported Environment siblings IQ 0.75 (combined)
Schiff et al. (1978) France France 4-12 <12 32 20 Not reported Biological siblings School results —0.70
Segal (1997) United States United States 4-12 < 12 6 6 Not reported Environment siblings IQ -1.14 IQ 2.67
Sharma et al. (1996) United States 81% United States 12-18 12-24 4682 4682 Not reported General population School results 0.37 (combined)
Sharma et al. (1998) United States United States 12-18 < 12 629 72 Not reported Environment School competence —0.45 f/—0.61 m
Silver (1970) United States Not reported 4-12 < 3 10 70 Not reported General population Learning problems 1.21
Silver (1989) United States Not reported 4-12 39 Perc. Not reported General population Learning problems 1.38
Skodak & Skeels (1949) United States Not reported 12-18 < 6 100 100 Not reported Standardized scores IQ -1.12
Smyer et al. (1998) Sweden Not reported Adults < 12 60 60 Not reported Biological (twin siblings) Education level —0.82
Stams et al. (2000) Netherlands Sri Lanka
4-12 < 6 159 Norm Not reported Standardized scores School results 0.33 IQ -0.34 f/—0.73 m Learning problems —0.05
Teas dale & Owen (1986) Denmark Not reported Adults < 12 302 4578 Not reported General population IQ 0.35
Education level 0.32
Tizard & Hodges (1978) England Not reported 8 > 24 25 14 Not reported Restored children IQ —0.40 (older), —0.62 (younger)
Tsitsikas et al. (1988) Greece Greece 5-6 < 12 72 72 Not reported Classmates IQ 0.64, school performance 0.29 Language 0.30
Verhulst et al. (1990) Netherlands Europe 12-18 > 24 2148 933 Not reported General population Perc. special education 0.25 f/0.29 m
Versluis-den Bieman & Verhulst (1995) Netherlands Europe 12-18 > 24 1538 Norm Not reported General population School competence 0.28 f/0.41 m
Wattier & Frydman (1985) Belgium Korea 89% 4-12 12-24 28 Norm Not reported Standardized scores IQ -0.06
Westhues & Cohen (1997) Canada Korea 40% India 40% South America 12-18 12-24 134 83 Not reported Environment siblings School performance 0.13
Wickes & Slate (1997) United States Korea > 18 > 36 174 Norm Not reported Norm group School results 0.09 f/0.07 m Language 0.07 f/0.03 m
Winick et al. (1975) United States Korea 4-12 > 24 112 Norm Malnourished Standardized scores School performance 0.00 IQ 0.00
Witmer et al. (1963) United States United States 12-18 < 12 484 484 Nothing special Classmates School performance 0.00 IQ 0.00

I found this one a long time ago and tweeted it, but apparently forgot to blog it.

Odenstad, A., Hjern, A., Lindblad, F., Rasmussen, F., Vinnerljung, B., & Dalen, M. (2008). Does age at adoption and geographic origin matter? A national cohort study of cognitive test performance in adult inter-country adoptees. Psychological Medicine, 38(12), 1803-1814.

Background Inter-country adoptees run risks of developmental and health-related problems. Cognitive ability is one important indicator of adoptees’ development, both as an outcome measure itself and as a potential mediator between early adversities and ill-health. The aim of this study was to analyse relations between proxies for adoption-related circumstances and cognitive development.
Method Results from global and verbal scores of cognitive tests at military conscription (mandatory for all Swedish men during these years) were compared between three groups (born 1968–1976): 746 adoptees born in South Korea, 1548 adoptees born in other non-Western countries and 330 986 non-adopted comparisons in the same birth cohort. Information about age at adoption and parental education was collected from Swedish national registers.
Results South Korean adoptees had higher global and verbal test scores compared to adoptees from other non-European donor countries. Adoptees adopted after age 4 years had lower test scores if they were not of Korean ethnicity, while age did not influence test scores in South Koreans or those adopted from other non-European countries before the age of 4 years. Parental education had minor effects on the test performance of the adoptees – statistically significant only for non-Korean adoptees’ verbal test scores – but was prominently influential for non-adoptees.
Conclusions Negative pre-adoption circumstances may have persistent influences on cognitive development. The prognosis from a cognitive perspective may still be good regardless of age at adoption if the quality of care before adoption has been ‘good enough’ and the adoption selection mechanisms do not reflect an overrepresentation of risk factors – both requirements probably fulfilled in South Korea.

I summarize and comment on the findings below:

Which adoptees?

In total, 2294 inter-country adoptees were born outside the Western countries (Europe, North America and Australia) and adopted before age 10 years. Of these, 746 were born in South Korea [Korean adoptee (KA) group]. The remaining 1548 individuals were born in other countries, Non-Korean adoptee (NKA) group. India was the most common country of origin, followed by Thailand, Chile, Ethiopia, Colombia and Sri Lanka. These were the only donor countries for which the number of adoptees included in this study exceeded 100. The non-adopted population (NAP) group consisted of non-adopted individuals born in Sweden (n=330 896).

Unfortunately, no more detailed information is given, so an origin country IQ x adoptee IQ study (spatial transferability) cannot be done.

Main results


We see that Korean adoptees do better than Swedes, even on the verbal test. The superiority stops being p<alpha when they control for various things. Notice that the disadvantage for non-Koreans becomes larger after control (their scores decrease and the Swedes’ scores increase).

Age at adoption matters, but apparently only for non-Koreans

[Figure: age at adoption]

This is in line with environmental cumulative disadvantage for non-Koreans. Alternatively, it is due to selection bias in that the less bright children (in the origin countries) are adopted later.

Perhaps the Koreans were placed with the better parents and this made them smarter?

Maybe, but the data shows that it isn’t important, even for transracial adoptives.

[Figure: parental education and IQ]

Notice the clear relationship between child IQ and parental education for the non-adopted population. Then notice the lack of a clear pattern among the adoptives. There may be a slight upward trend (for Koreans), but it is weak (only .22 between lowest and highest education for Koreans, giving a d≈.10) and not found for non-Koreans (middle education-level had highest scores).
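The d≈.10 figure is simple arithmetic: the raw score difference divided by the standard deviation of the scale. The sketch below assumes the conscription scores are reported on a stanine scale with an SD of 2. That scale is an assumption on my part, not something stated above, and the function name is only illustrative.

```python
def standardized_difference(mean_diff, sd):
    """Convert a raw mean difference into standard deviation units (d)."""
    return mean_diff / sd

# Assumption: scores are on a stanine scale, which has SD = 2 by construction.
STANINE_SD = 2.0

# .22 is the gap between the lowest and highest parental education groups (Koreans).
d_korean = standardized_difference(0.22, STANINE_SD)
print(f"d = {d_korean:.2f}")
```

If the scores were on a different scale, only the SD constant would need to change.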

Still, one could claim that in Korea, smarter or normal children are given up for adoption, while in the other non-Western countries this isn’t the case, or even the opposite is the case. This study cannot address this possibility.

This study is much larger than other studies and also has a comparison group. The main problem with it is that it does not report data for more countries of origin. Only the (superior) Koreans are singled out.

It seems that no one has integrated this literature yet. I will take a quick stab at it here. It could be expanded into a proper paper later in case someone wants to and has the time.


Lee Jussim (also blog) has done a tremendous job at reviewing the stereotype literature in recent years. In general he has found that stereotypes are mostly moderately to very accurate. On the other hand, self-fulfilling prophecies are probably real but fairly limited (e.g. work best when teachers don’t know their students well yet), especially in comparison to stereotype accuracy. Of course, these findings are exactly the opposite of what social psychologists, taken as a group, have been telling us for years.

The best short review of the literature is their book chapter The Unbearable Accuracy of Stereotypes. A longer treatment can be found in his 2012 book Social Perception and Social Reality: Why Accuracy Dominates Bias and Self-Fulfilling Prophecy (libgen).

Occupational success and cognitive ability

Society is more or less a semi-stable hierarchy based on mostly inherited personality traits, cognitive ability as well as some family-based advantage. This shows up in the examination of surnames over time in many countries, as documented in Gregory Clark’s book The Son Also Rises: Surnames and the History of Social Mobility (libgen). One example:

[Figure: Sweden stability]

Briefly put, surnames are kind of an extended family and they tend to keep their standing over time. They regress towards the mean (not the statistical kind!), but slowly. This is due to outmarrying (marrying people from lower classes) and genetic regression (i.e. predicted via breeder’s equation and due to the fact that narrow heritability and shared environment do not add up to 1).
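To get a feel for what slow regression looks like, here is a minimal sketch. It assumes the simplest possible model, in which each generation keeps a fixed fraction b of the previous generation's deviation from the population mean. The function name and the values of b are illustrative choices, not figures taken from the book.

```python
def expected_deviation(initial, persistence, generations):
    """Expected deviation from the mean after a number of generations,
    assuming a constant persistence rate b per generation (hypothetical model)."""
    return initial * persistence ** generations

# Compare a low and a high persistence rate; both values are illustrative only.
for b in (0.4, 0.75):
    path = [round(expected_deviation(1.0, b, t), 2) for t in range(6)]
    print(f"b = {b}: {path}")
```

With a high persistence rate, a group that starts well above the mean stays visibly above it for many generations, which is the pattern the surname data are meant to show.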

It also shows up when educational attainment is directly examined with behavioral genetic methods. We reviewed the literature recently:

How do we find out whether g is causally related to later socioeconomic status? There are at least five lines of evidence: First, g and socioeconomic status correlate in adulthood. This has consistently been found for so many years that it hardly bears repeating[22, 23]. Second, in longitudinal studies, childhood g is a good correlate of adult socioeconomic status. A recent meta-analysis of longitudinal studies found that g was a better correlate of adult socioeconomic status and income than was parental socioeconomic status[24]. Third, there is a genetic overlap of causes of g and socioeconomic status and income[25, 26, 27, 28]. Fourth, multiple regression analyses show that IQ is a good predictor of future socioeconomic status, income and more, even controlling for parental income and the like[29]. Fifth, comparisons between full-siblings reared together show that those with higher IQ tend to do better in society. This cannot be attributed to shared environmental factors since these are the same for both siblings[30, 31].

I’m not aware of any behavioral genetic study of occupational success itself, but that may exist somewhere. (The scientific literature is basically a very badly standardized, difficult to search database.) But clearly, occupational success is closely related to income, educational attainment, cognitive ability and certain personality traits, all of which show substantial heritability and some of which are known to correlate genetically.

Occupations and cognitive ability

An old line of research shows that there is indeed a stable hierarchy in occupations’ mean and minimum cognitive ability levels. One good review of this is Meritocracy, Cognitive Ability, and the Sources of Occupational Success, a working paper from 2002. I could not find a more recent version. The paper itself is somewhat antagonistic toward the idea (the author hates psychometricians, in particular dislikes Herrnstein and Murray, as well as Jensen) but it does neatly summarize a lot of findings.

[Figures: occupational IQ, 1 to 7]

The last one is from Gottfredson’s book chapter g, jobs, and life (her site, better version).

Occupations and cognitive ability in preparation

Furthermore, we can go a step back from the above and find SAT scores (almost an IQ test) by college majors (more numbers here). These later result in people working in different occupations, altho the connection is not always a simple one-to-one, but somewhere between many-to-many and one-to-one, we might call it a few to a few. Some occupations only recruit persons with particular degrees — doctors must have degrees in medicine — while others are flexible within limits. Physics majors often don’t work with physics at their level of competence, but instead work as secondary education teachers, in the finance industry, as programmers, as engineers and of course sometimes as physicists of various kinds such as radiation specialists at hospitals and meteorologists. But still, physicists don’t often work as child carers or psychologists, so there is in general a strong connection between college majors and occupations.

There is some stereotype research into college majors. For instance, a recently popularized study showed that beliefs about intellectual requirements of college majors correlated with female% of the field, as in, the fields perceived to be more difficult had fewer women. In fact, the perceived difficulty of the field probably mostly just proxies the actual difficulty of the field, as measured by the mean SAT/ACT score of the students. However, no one seems to have actually correlated the SAT scores with the perceived difficulty, which is the correlation that is the most relevant for stereotype accuracy research.

There is a catch, however. If one analyses the SAT subtests vs. gender%, one sees that it is mostly the quantitative part of the SAT that gives rise to the SAT x gender% correlation. One can also see that the gender% correlates with median income by major.

[Figures: quant-by-college-major-gender, verbal-by-college-major-gender]

Stereotypes about occupations and their cognitive ability

Finally, we get to the central question. If we ask people to estimate the cognitive ability of persons by occupation and then correlate this with the actual cognitive ability, what do we get? Jensen summarizes some results in his 1980 book Bias in Mental Testing (p. 339). I mark the most important passages.

People’s average ranking of occupations is much the same regardless of the basis on which they were told to rank them. The well-known Barr scale of occupations was constructed by asking 30 “psychological judges” to rate 120 specific occupations, each definitely and concretely described, on a scale going from 0 to 100 according to the level of general intelligence required for ordinary success in the occupation. These judgments were made in 1920. Forty-four years later, in 1964, the National Opinion Research Center (NORC), in a large public opinion poll, asked many people to rate a large number of specific occupations in terms of their subjective opinion of the prestige of each occupation relative to all of the others. The correlation between the 1920 Barr ratings based on the average subjectively estimated intelligence requirements of the various occupations and the 1964 NORC ratings based on the average subjective opined prestige of the occupations is .91. The 1960 U.S. Census of Population: Classified Index of Occupations and Industries assigns each of several hundred occupations a composite index score based on the average income and educational level prevailing in the occupation. This index correlates .81 with the Barr subjective intelligence ratings and .90 with the NORC prestige ratings.

Rankings of the prestige of 25 occupations made by 450 high school and college students in 1946 showed the remarkable correlation of .97 with the rankings of the same occupations made by students in 1925 (Tyler, 1965, p. 342). Then, in 1949, the average ranking of these occupations by 500 teachers college students correlated .98 with the 1946 rankings by a different group of high school and college students. Very similar prestige rankings are also found in Britain and show a high degree of consistency across such groups as adolescents and adults, men and women, old and young, and upper and lower social classes. Obviously people are in considerable agreement in their subjective perceptions of numerous occupations, perceptions based on some kind of amalgam of the prestige image and supposed intellectual requirements of occupations, and these are highly related to such objective indices as the typical educational level and average income of the occupation. The subjective desirability of various occupations is also a part of the picture, as indicated by the relative frequencies of various occupational choices made by high school students. These frequencies show scant correspondence to the actual frequencies in various occupations; high-status occupations are greatly overselected and low-status occupations are seldom selected.

How well do such ratings of occupations correlate with the actual IQs of the persons in the rated occupations? The answer depends on whether we correlate the occupational prestige ratings with the average IQs in the various occupations or with the IQs of individual persons. The correlations between average prestige ratings and average IQs in occupations are very high— .90 to .95—when the averages are based on a large number of raters and a wide range of rated occupations. This means that the average of many people’s subjective perceptions conforms closely to an objective criterion, namely, tested IQ. Occupations with the highest status ratings are the learned professions—physician, scientist, lawyer, accountant, engineer, and other occupations that involve high educational requirements and highly developed skills, usually of an intellectual nature. The lowest-rated occupations are unskilled manual labor that almost any able-bodied person could do with very little or no prior training or experience and that involves minimal responsibility for decisions or supervision.

The correlation between rated occupational status and individual IQs ranges from about .50 to .70 in various studies. The results of such studies are much the same in Britain, the Netherlands, and the Soviet Union as in the United States, where the results are about the same for whites and blacks. The size of the correlation, which varies among different samples, seems to depend mostly on the age of the persons whose IQs are correlated with occupational status. IQ and occupational status are correlated .50 to .60 for young men ages 18 to 26 and about .70 for men over 40. A few years can make a big difference in these correlations. The younger men, of course, have not all yet attained their top career potential, and some of the highest-prestige occupations are not even represented in younger age groups. Judges, professors, business executives, college presidents, and the like are missing occupational categories in the studies based on young men, such as those drafted into the armed forces (e.g., the classic study of Harrell & Harrell, 1945).
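The gap between the aggregate-level correlations (.90 to .95) and the individual-level ones (.50 to .70) is what one would expect when IQ varies a lot within occupations. A small simulation can illustrate this. Everything below is a sketch with made-up parameters: the number of occupations, the between- and within-occupation SDs and the rating error are all assumptions chosen for illustration, not estimates from Jensen or any dataset.

```python
import random
from statistics import mean

def pearson(x, y):
    """Plain Pearson correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def simulate(n_occupations=50, n_per_occupation=200, seed=1):
    """Return (r at the occupation-mean level, r at the individual level)."""
    rng = random.Random(seed)
    status, occupation_mean_iq = [], []
    person_status, person_iq = [], []
    for _ in range(n_occupations):
        true_mean = rng.gauss(100, 10)        # assumed between-occupation SD
        rated = true_mean + rng.gauss(0, 3)   # assumed error in the averaged ratings
        iqs = [true_mean + rng.gauss(0, 11)   # assumed within-occupation SD
               for _ in range(n_per_occupation)]
        status.append(rated)
        occupation_mean_iq.append(mean(iqs))
        person_status.extend([rated] * n_per_occupation)
        person_iq.extend(iqs)
    return pearson(status, occupation_mean_iq), pearson(person_status, person_iq)

r_means, r_individuals = simulate()
print(f"rated status vs occupation mean IQ: r = {r_means:.2f}")
print(f"rated status vs individual IQ:      r = {r_individuals:.2f}")
```

The same ratings produce a much higher correlation with occupation means than with individual scores, because averaging removes the within-occupation spread. The exact values depend entirely on the assumed SDs.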

I predict that there is a lot of delicious low-hanging, ripe research fruit ready for harvest in this area if one takes a day or ten to dig up some data and read thru older papers, books and reports.