As Jason Malloy has mentioned, it is strange that in the race intelligence debates, people usually cite the same few studies over and over:

Shortly after writing that post, I decided that more needed to be written about transracial adoption research as a behavior genetic experiment. Arthur Jensen, Richard Lynn, and J. Philippe Rushton have all cited the Minnesota Transracial Adoption Study, as well as several IQ studies of transracially adopted Asians, in support of the hereditarian position. And Richard Nisbett has referenced several other adoption studies that suggest no racial gaps. However, I suspected there was more data for transracially adopted children than what this small cadre of scientists had already discussed (at the very least for important variables other than intelligence); research that could give us a more complete picture of what these unusual children become, and what this can tell us about the causes of ethnic differences in socially valued outcomes.

One way of finding hard-to-find studies is going thru articles that cite popular reviews of the topic. This is not possible if the reviews have thousands of citations, but sometimes it is. In this case, I used the review: Kim, W. J. (1995). International adoption: A case review of Korean children. Child Psychiatry and Human Development, 25(3), 141-154. I then simply looked up the studies that cite it (101 results on Scholar). We are looking for studies that report country or area of origin as well as relevant criterion variables such as IQ scores, GPA, educational attainment/achievement, income, socioeconomic status/s factor, crime rates, use of public benefits and so on.

This led me to the following study: Lindblad, F., Hjern, A., & Vinnerljung, B. (2003). Intercountry Adopted Children as Young Adults: A Swedish Cohort Study. American Journal of Orthopsychiatry, 73(2), 190-202. The abstract is promising:

In a national cohort study, the family and labor market situation, health problems, and education of 5,942 Swedish intercountry adoptees born between 1968 and 1975 were examined and compared with those of the general population, immigrants, and a siblings group—all age matched—in national registers from 1997 to 1999. Adoptees more often had psychiatric problems and were longtime recipients of social welfare. Level of education was on par with that of the general population but lower when adjusted for socioeconomic status.

The sample consists of:

There were 5,942 individuals in the adoptee study group: 3,237 individuals were born in the Far East (2,658 were born in South Korea), 1,422 in South Asia, 871 in Latin America, and 412 in Africa. In the other study groups there were 1,884 siblings, 8,834 European immigrants, 3,544 non-European immigrants, and 723,154 individuals in the general population.

So, by the usual standards, this is a very large study. We are interested in the region of birth results. They are in two tables:

[Tables 7 and 8 from Lindblad et al. (2003): results by region of birth]

We can note that the Far East group (mostly South Korean; presumably the rest are North Korean (?), Chinese, Japanese, Vietnamese (?)) usually gets the better outcomes. They were less often married, showed mixed results for living with parents, and were more likely to have a university degree, less likely to have only primary school, more likely to be in the workforce, less likely to be unemployed, less likely to receive welfare, much less likely to be admitted for alcohol abuse (probably due to Asian alcohol flush syndrome), less likely to be admitted for a psychiatric diagnosis, and less likely to receive disability pension. Results for hospital admissions for substance abuse were mixed.

It would probably have been better if one could aggregate the results and look at the general socioeconomic factor instead. It is not possible to do so by factor analysis with the above results, since there are only 4 cases and 11 variables. One could calculate a score by choosing some or all of the variables. Or one could assign them factor loadings manually and then calculate scores. I calculated a unit-weighted score based on all but the first two indicators. The two indicators for which a higher value is the better outcome (university degree and workforce) were reversed (by taking 1/OR), so that higher values always mean worse outcomes relative to the Far East group. I also calculated the median score, which is resistant to outliers (e.g. the alcohol abuse indicator). Results:


Socioeconomic outcomes by region of origin, and estimated S scores
Group                Latin America   Africa   South Asia   Far East
Uni.degree           2.50            1.43     1.67         1
Only.primary.ed      1.60            1.50     1.00         1
Workforce            1.43            1.43     1.11         1
Unemployed           1.30            0.90     1.30         1
Welfare.use          1.90            1.50     1.30         1
Substance.abuse      2.70            0.70     1.00         1
Alcohol.abuse        4.50            4.90     3.60         1
Psychiatric.diag     1.50            1.40     1.20         1
Disability.pension   1.30            1.80     1.30         1
Mean.S               2.08            1.73     1.50         1
Median.S             1.60            1.43     1.30         1
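The unit-weighted scoring can be sketched in a few lines. This is an illustrative Python sketch, not the code used for the post. The values are copied from the table above, and the label of the first row (university degree, already reversed) is my assumption based on the order of the indicators in the text; the function and variable names are hypothetical.

```python
from statistics import mean, median

# Values relative to the Far East group (= 1), copied from the table above.
# Column order: Latin America, Africa, South Asia, Far East.
# Assumption: the unlabeled first row is the (already reversed) university
# degree indicator; workforce is also already reversed (1/OR).
regions = ["Latin America", "Africa", "South Asia", "Far East"]
indicators = {
    "Uni.degree":         [2.50, 1.43, 1.67, 1.0],
    "Only.primary.ed":    [1.60, 1.50, 1.00, 1.0],
    "Workforce":          [1.43, 1.43, 1.11, 1.0],
    "Unemployed":         [1.30, 0.90, 1.30, 1.0],
    "Welfare.use":        [1.90, 1.50, 1.30, 1.0],
    "Substance.abuse":    [2.70, 0.70, 1.00, 1.0],
    "Alcohol.abuse":      [4.50, 4.90, 3.60, 1.0],
    "Psychiatric.diag":   [1.50, 1.40, 1.20, 1.0],
    "Disability.pension": [1.30, 1.80, 1.30, 1.0],
}


def unit_weighted_scores(table, summary):
    """Summarize all indicators per region with equal weights.

    `summary` is a function such as mean or median applied to the
    column of indicator values belonging to one region.
    """
    n_regions = len(next(iter(table.values())))
    return [
        round(summary([row[i] for row in table.values()]), 2)
        for i in range(n_regions)
    ]


mean_s = dict(zip(regions, unit_weighted_scores(indicators, mean)))
median_s = dict(zip(regions, unit_weighted_scores(indicators, median)))

print(mean_s)
print(median_s)
```

The median is less affected by the very large alcohol abuse values, which is why the two summaries differ most for Latin America and Africa.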


It is interesting to see that the Africans did better than the Latin Americans. Several explanations are possible. Perhaps the Latin Americans are from countries with a high percentage of African admixture. Or perhaps it’s some kind of selection effect.

In their discussion they write:

There were considerable differences between adoptees from different geographical regions with better outcomes in many respects for children from the Far East, in this context mainly South Korea. Similar positive adjustment results concerning Asian adoptees have been presented previously. For instance, an excellent prognosis concerning adjustment and identity development in Chinese adoptees in Britain was described (Bagley, 1993). A Dutch group recently presented data about academic achievement and intelligence in 7-year-old children adopted in infancy (Stams, Juffer, Rispens, & Hoksbergen, 2000). The South Korean group had high IQs with 31% above a score of 120. Pre- and postnatal care before adoption seems to be particularly well organized in South Korea (Kim, 1995), which may be one important reason for the positive outcome. The differences among the geographic regions may also, however, be due to a large number of other factors such as differences in nutrition, motives behind the adoption, quality of care in the orphanage-foster home before the adoption, genetic dispositions, and Swedish prejudices against “foreign-looking” people. Another explanation may be a larger number of younger infants in the South Korean group. However, that is not possible to verify from our register data.

The usual cultural explanations.

I have also contacted the Danish statistics office to hear if they have Danish data.


One can study a given human trait at many levels. Probably the most common is the individual level. The next-most common is the inter-national, and the least common perhaps the intra-national. This last one can be done at various levels too, e.g. state, region, commune, and city. These divisions usually vary by country.

The study of general intelligence (GI) at these higher levels has been called the ecology of intelligence by Richard Lynn (1979, 1980) and the sociology of intelligence by Gottfredson (1998). Lynn’s two old papers cited before actually contain quite a bit of data which can be re-analyzed too. I will do so in a future post. I also decided that this series of posts will have to turn into one big paper with a review and meta-analysis. There are strong patterns in the data not previously explored or synthesized by researchers.

Lynn has published a number of papers on the regions of Italy (2010a, 2010b, 2012 and Piffer and Lynn 2014) and it is this topic I turn to in this post.

Lynn’s 2010 data

True to his style, Lynn (2010a) contains the raw data used for his analysis. It is fortunate because it means anyone can re-analyze them. His paper contains the following variables:

  1. 3x PISA subtests: reading, math, science
  2. An average of these PISA scores
  3. An IQ derived from the average
  4. Stature for 1855, 1910, 1927 and 1980
  5. Per capita income for 1970 and 2003
  6. Infant mortality for 1955 and 1999
  7. Literacy 1880
  8. Years of education 1951, 1971 and 2001
  9. Latitude.

These data are given for 12 Italian regions.

Lynn himself merely did correlational analysis and discussed the results. The data however can be usefully factor analyzed to extract a G (from the three PISA subtests) and S factor (from all the socioeconomic variables). I imported the data into R.

Lynn’s choice of variables is quite odd. They are not all from the same years, presumably because he picked them from various other papers instead of going to the Italian statistics website to fetch some himself. This opens the question of how to analyze them. I ran a factor analysis (MinRes, default settings for fa() from the psych package) on the new socioeconomic data only, the old data only, and all of it. The two factor analyses of the limited datasets did not reveal anything interesting not shown in the full analysis, so I only show results from the full analysis. Note that by doing this, I broke the rule of thumb about the number of cases per variable (at least 2) because there are 7 variables in my analysis but only 13 cases with full data. The loading plot is:


This plot reveals no surprises.

The loadings for the G factor with the PISA subtests were all .99, so it is pointless to post a plot. The scatter plot for G and S is:


And MCV with reversing:


New data

Being dissatisfied with the data Lynn reported, I decided to collect more data. The PISA 2012 results have PISA scores for more regions than before which allows for an analysis with more cases. This also means that one can use more variables in the factor analysis. The new PISA data has 22 regions, so one can use about 11 variables. However, due to some missing data, only 21 regions were available for analysis (Südtirol had some missing data). So I settled on using 10 variables.

To get data for the analysis, I followed the approach taken in the previous post on the S factor in US states. I went to the official statistics bank, IStat, and fetched data for the regions. Like before, for MCV to work well, one needs a diverse selection of variables, so that there is diversity in their S loadings (not just direction of loading). I settled on the following 10 variables:

  1. Political participation index, 9 years
  2. Percent with normal weight, 9 years
  3. Percent smokers, 10 years
  4. Intentional homicide rate, 4 years
  5. Total crime rate, 4 years
  6. Unemployment, 10 years
  7. Life expectancy males, 10 years
  8. Total fertility rate, 10 years
  9. Interpersonal trust index, 5 years
  10. No savings percent, 10 years

For all variables, I calculated the mean for all years. I fetched the last 10 years for all data when available.

For cognitive data, I fetched the regional scores for reading, mathematics and science subtests from PISA 2012, Annex B2.

Factor analysis

I proceeded like above. The loadings plot is:


There are two odd results. Total crime rate has a slight positive loading (.16) while intentional homicide rate has a strong negative loading (-.72). Lynn reported a similar finding in his 1980 paper on Britain. He explained it as being due to urbanization, which increases population density which increases crime rates (more opportunities, more interpersonal conflicts). An alternative hypothesis is that the total crime rate is being increased by immigrants who live mostly in the north. Perhaps one can get crime rates for natives only to test this. A third hypothesis is that it has to do with differences in the legal system, for instance, prosecutor practice in determining which actions to pull into the legal system.

The second odd result is that fertility has a positive loading. Generally, it has been found that fertility has a slight negative correlation with GI and s factor at the individual level, see e.g. Lynn (2011). It has also been found that internationally, GI has a strong negative relationship, -.5 to -.7 depending on measure, to fertility (Shatz, 2008; Lynn and Harvey, 2008). I also found something similar, -.5, when I examined Danish immigrant groups by country of origin (Kirkegaard, 2014). However, if one examines European countries only, one sees that fertility is relatively ‘high’ (a bit below 2) in the northern countries (Nordic countries, UK), and low in the southern and eastern countries. This means that the correlation between fertility and IQ (e.g. PISA) across European countries is positive. Maybe this has some relevance to the current finding. Maybe immigrants are pulling the fertility up in the northern regions.

There is little to report from the factor analysis of PISA results. All loadings between .98 and .99.

Scatter plot of S and G


MCV with reversing


Inter-dataset scatter plots

To examine the inter-dataset stability of factor scores:

[Scatter plots: S (Lynn data) vs. S (new data), and G (Lynn data) vs. G (new data)]

For one case, the Lynn dataset had data for a merged region. I merged the two regions in the new dataset to match it up against the one from Lynn’s. This is the conservative choice. One could have used Lynn’s data for both regions instead which would have increased the sample size by 1.


The results for the regional G and S in Italian regions are especially strong. They rival even the international S factor in their correlation with the G estimates. Italy really is a very divided country. Stability across datasets was very strong too, so Lynn’s odd choice of data was not inflating the results.

MCV worked better in the dataset with more and more diverse indicator variables for S, as would be expected if the correlation was artificially low in the first dataset due to restriction of range in the S loadings.

Supplementary material

All project files (R source code, data files, plots) are available on the Open Science Framework repository.

Thanks to Davide Piffer for catching an error + help in matching the regions up from the two datasets.


  • Gottfredson, L. S. (1998). Jensen, Jensenism, and the sociology of intelligence. Intelligence, 26(3), 291-299.
  • Kirkegaard, E. O. (2014). Criminality and fertility among Danish immigrant populations. Open Differential Psychology.
  • Lynn, R. (1979). The social ecology of intelligence in the British Isles. British Journal of Social and Clinical Psychology, 18(1), 1-12.
  • Lynn, R. (1980). The social ecology of intelligence in France. British Journal of Social and Clinical Psychology, 19(4), 325-331.
  • Lynn, R., & Harvey, J. (2008). The decline of the world’s IQ. Intelligence, 36(2), 112-120.
  • Lynn, R. (2010a). In Italy, north–south differences in IQ predict differences in income, education, infant mortality, stature, and literacy. Intelligence, 38(1), 93-100.
  • Lynn, R. (2010b). IQ differences between the north and south of Italy: A reply to Beraldo and Cornoldi, Belacchi, Giofre, Martini, and Tressoldi. Intelligence, 38(5), 451-455.
  • Lynn, R. (2011). Dysgenics: Genetic deterioration in modern populations. Second edition. Westport CT.
  • Lynn, R. (2012). IQs in Italy are higher in the north: A reply to Felice and Giugliano. Intelligence, 40(3), 255-259.
  • Piffer, D., & Lynn, R. (2014). New evidence for differences in fluid intelligence between north and south Italy and against school resources as an explanation for the north–south IQ differential. Intelligence, 46, 246-249.
  • Shatz, S. M. (2008). IQ and fertility: A cross-national study. Intelligence, 36(2), 109-111.


Introduction and data sources

In my previous two posts, I analyzed the S factor in 33 Indian states and 31 Chinese regions. In both samples I found strongish S factors and they both correlated positively with cognitive estimates (IQ or G). In this post I used cognitive data from McDaniel (2006). He gives two sets of estimated IQs based on SAT-ACT and on NAEP. Unfortunately, they only correlate .58, so at least one of them is not a very accurate estimate of general intelligence.

His article also reports some correlations between these IQs and socioeconomic variables: Gross State Product per capita, median income and percent poverty. However, data for these variables is not given in the article, so I did not use them. I am not sure where his data came from.

However, with cognitive data like this and the relatively large number of datapoints (50 or 51 depending on use of District of Columbia), it is possible to do a rather good study of the S factor and its correlates. High quality data for US states are readily available, so results should be strong. Factor analysis requires a case to variable ratio of at least 2:1 to deliver reliable results (Zhao, 2009). So, this means that one can do an S factor analysis with about 25 variables.

Thus, I set out to find about 25 diverse socioeconomic variables. There are two reasons to gather a very diverse sample of variables. First, for the method of correlated vectors to work (Jensen, 1998), there must be variation in the indicators’ loading on the factor. Lack of variation causes restriction of range problems. Second, lack of diversity in the indicators of a latent variable leads to psychometric sampling error (Jensen and Weng, 1994; review post here for general intelligence measures).

My primary source was The 2012 Statistical Abstract website. I simply searched for “state” and picked various measures. I tried to pick things that weren’t too dependent on geography. E.g. kilometer of coast line per capita would be very bad since it is not socioeconomic and is very dependent (near 100%) on geographical factors. To increase reliability, I generally used all data for the last 10 years and averaged them. Curious readers should see the datafile for details.

I ended up with the following variables:

  1. Murder rate per 100k, 10 years
  2. Proportion with high school or more education, 4 years
  3. Proportion with bachelor or more education, 4 years
  4. Proportion with advanced degree or more, 4 years
  5. Voter turnout, presidential elections, 3 years
  6. Voter turnout, house of representatives, 6 years
  7. Percent below poverty, 10 years
  8. Personal income per capita, 1 year
  9. Percent unemployed, 11 years
  10. Internet usage, 1 year
  11. Percent smokers, male, 1 year
  12. Percent smokers, female, 1 year
  13. Physicians per capita, 1 year
  14. Nurses per capita, 1 year
  15. Percent with health care insurance, 1 year
  16. Percent in ‘Medicaid Managed Care Enrollment’, 1 year
  17. Proportion of population urban, 1 year
  18. Abortion rate, 5 years
  19. Marriage rate, 6 years
  20. Divorce rate, 6 years
  21. Incarceration rate, 2 years
  22. Gini coefficient, 10 years
  23. Top 1%, proportion of total income, 10 years
  24. Obesity rate, 1 year

Most of these are self-explanatory. For the economic inequality measures, I found 6 different measures (here). Since I wanted diversity, I chose the Gini and the top 1% because these correlated the least and are well-known.

Aside from the above, I also fetched the racial proportions for each state, to see how they relate to the S factor (and the various measures above, but to get these, run the analysis yourself).

I used R with RStudio for all analyses. Source code and data are available in the supplementary material.

Missing data

In large analyses like this there are nearly always some missing data. The matrixplot() looks like this:


(It does not seem possible to change the font size, so I have cut off the names at the 8th character.)

We see that there aren’t many missing values. I imputed all the missing values with the VIM package (deterministic imputation using multiple regression).

Extreme values

A useful feature of the matrixplot() is that it shows in greytone the relative outliers for each variable. We can see that some of the variables have some hefty outliers, which may be data errors. Therefore, I examined them.

The outlier in the two university degree variables is DC, surely because the government is based there and there is a huge lobbyist center. For the marriage rate, the outlier is Nevada. Many people go there and get married. For the physician and nurse rates the outlier is also DC, for the same reason (maybe one could make up some story about how politics causes health problems!).

After imputation, the matrixplot() looks like this:


It is pretty much the same as before, which means that we did not substantially change the data — good!

Factor analyzing the data

Then we factor analyze the data (socioeconomic data only). We plot the loadings (sorted) with a dotplot:


We see a wide spread of variable loadings. All but two of them load in the expected direction (positive for socially valued outcomes, negative for the opposite), showing the existence of the S factor. The ‘exceptions’ are as follows. Abortion rate loads +.60, but is often seen as a negative thing. It is however open to discussion. Maybe higher abortion rates can be interpreted as less backward religiousness or more freedom for women (both good in my view). The other is marriage rate at -.19 (weak loading). I’m not sure how to interpret that. In any case, for both of these it is debatable which direction is the desirable one.

Correlations with cognitive measures

And now comes the big question, does state S correlate with our IQ estimates? They do, the correlations are: .14 (SAT-ACT) and .43 (NAEP). These are fairly low given our expectations. Perhaps we can work out what is happening if we plot them:


Now we can see what is going on. First, the SAT-ACT estimates are pretty strange for three states: California, Arizona and Nevada. I note that these are three adjacent states, so it is quite possibly some kind of regional testing practice that’s throwing off the estimates. If someone knows, let me know. Second, DC is a huge outlier in S, as we may have expected from our short discussion of extreme values above. It’s basically a city state which is half-composed of low s (SES) African Americans and half upper class related to government.

Dealing with outliers – Spearman’s correlation aka. rank-order correlation

There are various ways to deal with outliers. One simple way is to convert the data into ranks, and just correlate those like normal. Pearson’s correlation is sensitive to outliers and to departures from normality, both of which are common with higher-level data (states, countries). Using rank-order gets us these:

[Scatter plots of ranked data: S vs. SAT-ACT IQ, and S vs. NAEP IQ]

So the correlations improved a lot for the SAT-ACT IQs and a bit for the NAEP ones.
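The rank-order approach can be sketched as follows. This is an illustrative Python sketch rather than the R code used for the analysis, and the data are made up to show the effect of a single extreme case; all names are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Plain Pearson product-moment correlation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def average_ranks(values):
    """Rank values from 1 to n; tied values get the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's correlation: Pearson's correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data: five well-behaved cases plus one extreme case,
# loosely analogous to a single outlying state.
s_scores = [-1.0, -0.5, 0.0, 0.5, 1.0, 6.0]
iq = [96.0, 98.0, 100.0, 102.0, 104.0, 95.0]

r_pearson = pearson(s_scores, iq)
r_spearman = spearman(s_scores, iq)
print(r_pearson, r_spearman)
```

Because the outlier can move at most to the first or last rank, it pulls the rank-order correlation much less than it pulls the Pearson correlation.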

Results without DC

Another idea is simply excluding the strange DC case, and then re-running the factor analysis. This procedure gives us these loadings:


(I have reversed them, because the factor came out with the opposite sign, e.g. education loading negatively.)

These are very similar to before, excluding DC did not substantially change results (good). Actually, the factor is a bit stronger without DC throwing off the results (using minres, proportion of var. = 36%, vs. 30%). The reason this happens is that DC is an odd case, scoring very high in some indicators (e.g. education) and very poorly in others (e.g. murder rate).

The correlations are:


So, not surprisingly, we see an increase in the effect sizes from before: .14 to .31 and .43 to .69.

Without DC and rank-order

Still, one may wonder what the results would be with rank-order and DC removed. Like this:


So compared to before, effect size increased for the SAT-ACT IQ and decreased slightly for the NAEP IQ.

Now, one could also do regression with weights based on some metric of the state population and this may further change results, but I think it’s safe to say that the cognitive measures correlate in the expected direction and with the removal of one strange case, the better measure performs at about the expected level with or without using rank-order correlations.

Method of correlated vectors

The MCV (Jensen, 1998) can be used to test whether a specific latent variable underlying some data is responsible for the observed correlation between the factor score (or factor score approximation such as IQ, an unweighted sum) and some criterion variable. Altho originally invented for use on cognitive test data and the general intelligence factor, I have previously used it in other areas (e.g. Kirkegaard, 2014). I also used it in the previous post on the S factor in India (but not China because there was a lack of variation in the loadings of socioeconomic variables on the S factor).
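In practice, MCV comes down to correlating two vectors: the indicators' loadings on the factor, and the indicators' correlations with the criterion variable. Below is a minimal Python sketch (not the R code used for the analysis) with made-up numbers. The `reverse_negative` option reflects my reading of "reversing", namely flipping every negatively loading indicator so that all loadings are positive; treat that as an assumption. All names are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Plain Pearson product-moment correlation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mcv(loadings, criterion_correlations, reverse_negative=False):
    """Method of correlated vectors.

    loadings: each indicator's loading on the factor.
    criterion_correlations: each indicator's correlation with the criterion.
    reverse_negative: if True, flip indicators with negative loadings.
        Reversing an indicator flips the sign of both its loading and
        its correlation with the criterion.
    """
    pairs = list(zip(loadings, criterion_correlations))
    if reverse_negative:
        pairs = [(-l, -r) if l < 0 else (l, r) for l, r in pairs]
    xs, ys = zip(*pairs)
    return pearson(xs, ys)


# Hypothetical example: four similar positive loadings and one negative one.
loadings = [0.80, 0.70, 0.75, 0.85, -0.60]
criterion_r = [0.40, 0.50, 0.45, 0.35, -0.30]

mcv_raw = mcv(loadings, criterion_r)
mcv_reversed = mcv(loadings, criterion_r, reverse_negative=True)
print(mcv_raw, mcv_reversed)
```

In this made-up example the unreversed result is high almost entirely because of the one negatively loading indicator, which is the kind of inflation to watch for when the positive loadings vary little.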

Using the dataset without DC, the MCV result for the NAEP dataset is:


So, again we see that MCV can reach high r’s when there is a large number of diverse variables. But note that the value can be considered inflated because of the negative loadings of some variables. It is debatable whether one should reverse them.

Racial proportions of states and S and IQ

A last question is whether the states’ racial proportions predict their S score and their IQ estimate. There are lots of problems with this. First, the actual genomic proportions within these racial groups vary by state (Bryc, 2015). Second, within ‘pure-breed’ groups, general intelligence varies by state too (this was shown in the testing of draftees in the US in WW1). Third, there is an ‘other’ group that also varies from state to state, presumably different kinds of Asians (Japanese, Chinese, Indians, other SE Asia). Fourth, it is unclear how one should combine these proportions into an estimate used for correlation analysis or model them. Standard multiple regression is unsuited for handling this kind of data with a perfect linear dependency, i.e. the total proportion must add up to 1 (100%). When all the proportions are entered as predictors they are perfectly collinear, so MR cannot estimate a separate weight for each of them. Surely some method exists that can handle this problem, but I’m not familiar with it. Given the four problems above, one will not expect near-perfect results, but one would probably expect most correlations to go in the expected direction with non-trivial size.

Perhaps the simplest way of analyzing it is correlation. These are susceptible to random confounds when e.g. white% correlates differentially with the other racial proportions. However, they should get the basic directions correct if not the effect size order too.

Racial proportions, NAEP IQ and S

For this analysis I use only the NAEP IQs and without DC, as I believe this is the best subdataset to rely on. I correlate this with the S factor and each racial proportion. The results are:

Racial group   NAEP IQ   S
White           0.69      0.18
Black          -0.50     -0.42
Hispanic       -0.38     -0.08
Other          -0.26      0.20


For NAEP IQ, depending on what one thinks of the ‘other’ category, these have either exactly or roughly the order one expects: W>O>H>B. If one thinks “other” is mostly East Asian (Japanese, Chinese, Korean) with higher cognitive ability than Europeans, one would expect O>W>H>B. For S, however, the order is now O>W>H>B and the effect sizes much weaker. In general, given the limitations above, these are perhaps reasonable if somewhat on the weak side for S.

Estimating state IQ from racial proportions using racial IQs

One way to utilize all four variables (white, black, hispanic and other) without having MR assign them weights is to assign them weights based on known group IQs and then calculate a mean estimated IQ for each state.

Depending on which estimates for group IQs one accepts, one might use something like the following:

State IQ est. = White*100+Other*100+Black*85+Hispanic*90

Or if one thinks other is somewhat higher than whites (this is not entirely unreasonable, but recall that the NAEP includes reading tests which foreigners and Asians perform less well on), one might want to use 105 for the other group (#2). Or one might want to raise black and hispanic IQs a bit, perhaps to 88 and 93 (#3). Or do both (#4). I did all of these variations, and the results are:
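The weighting scheme and its four variants can be written out as a small sketch. This is illustrative Python, not the R code used for the analysis. The group values are the ones given in the text, the variant names are assumed to map onto #1 to #4 in order, and the example state proportions are made up.

```python
# Group IQ assumptions for the four variants described in the text.
# Assumption: Race.IQ .. Race.IQ4 correspond to variants #1 .. #4 in order.
GROUP_IQ_VARIANTS = {
    "Race.IQ":  {"white": 100, "other": 100, "black": 85, "hispanic": 90},
    "Race.IQ2": {"white": 100, "other": 105, "black": 85, "hispanic": 90},
    "Race.IQ3": {"white": 100, "other": 100, "black": 88, "hispanic": 93},
    "Race.IQ4": {"white": 100, "other": 105, "black": 88, "hispanic": 93},
}


def estimated_state_iq(proportions, group_iqs):
    """Proportion-weighted mean of the assumed group IQs for one state."""
    total = sum(proportions.values())
    if abs(total - 1.0) > 1e-6:
        raise ValueError("proportions must sum to 1")
    return sum(proportions[group] * iq for group, iq in group_iqs.items())


# Hypothetical state, not taken from the dataset.
example_state = {"white": 0.70, "black": 0.10, "hispanic": 0.15, "other": 0.05}

estimates = {
    name: estimated_state_iq(example_state, group_iqs)
    for name, group_iqs in GROUP_IQ_VARIANTS.items()
}
print(estimates)
```

Since the weights are fixed in advance, the four proportions collapse into one predictor per state, which sidesteps the linear dependency problem with multiple regression.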

Variable   Race.IQ   Race.IQ2   Race.IQ3   Race.IQ4
Race.IQ    1         0.96       1          0.93
Race.IQ2   0.96      1          0.96       0.99
Race.IQ3   1         0.96       1          0.94
Race.IQ4   0.93      0.99       0.94       1
NAEP IQ    0.67      0.56       0.67       0.51
S          0.41      0.44       0.42       0.45


As far as I can tell, there is no strong reason to pick any of these over each other. However, what we learn is that the correlation between the racial IQ estimate and the NAEP IQ estimate is somewhere between .51 and .67, and the correlation between the racial IQ estimate and S is somewhere between .41 and .45. I think these are reasonable results given the problems of this analysis described above.

Added March 11: New NAEP data

I came across a series of posts by science blogger The Audacious Epigone, who has also estimated IQs based on NAEP data. He has done this three times (for 2013, 2009 and 2005 data), so along with McDaniel’s estimates, this gives us 4 non-identical estimates. First, we check their intercorrelations, which should be very high, r>.9, for this kind of data. Second, we extract the general factor and use it as the best estimate of NAEP IQ for the states (I deleted DC again). Third, we see how all 5 variables relate to S from before.


             NAEP.IQ.13   NAEP.IQ.09   NAEP.IQ.05   NAEP M.   NAEP.1
NAEP.IQ.09   0.96
NAEP.IQ.05   0.83         0.89
NAEP M.      0.88         0.93         0.96
NAEP.1       0.95         0.99         0.95         0.97
S            0.81         0.76         0.64         0.69      0.75


Where NAEP.1 is the general NAEP factor. We see that intercorrelations between NAEP estimates are not that high, they average only .86. Their loadings on the common factor are very high tho, .95 to .99. Still, using the common factor should improve results by reducing measurement error. And it does, NAEP IQ x S is now .75 from .69.

Scatter plot



Supplementary material

Data files and R source code available on the Open Science Framework repository.


Bryc, K., Durand, E. Y., Macpherson, J. M., Reich, D., & Mountain, J. L. (2015). The Genetic Ancestry of African Americans, Latinos, and European Americans across the United States. The American Journal of Human Genetics, 96(1), 37-53.

Jensen, A. R., & Weng, L. J. (1994). What is a good g?. Intelligence, 18(3), 231-258.

Jensen, A. R. (1998). The g factor: The science of mental ability. Westport, CT: Praeger.

Kirkegaard, E. O. W. (2014). The international general socioeconomic factor: Factor analyzing international rankings. Open Differential Psychology.

McDaniel, M. A. (2006). State preferences for the ACT versus SAT complicates inferences about SAT-derived state IQ estimates: A comment on Kanazawa (2006). Intelligence, 34(6), 601-606.

Zhao, N. (2009). The Minimum Sample Size in Factor Analysis.


Richard Lynn has been publishing a number of papers on IQ in regions/areas/states within countries along with various socioeconomic correlates. However, his and his co-authors’ analysis is usually limited to reporting the correlation matrix. This is a pity, because the data allow for a more interesting analysis with the S factor (see Kirkegaard, 2014). I have previously re-analyzed Lynn and Yadav (2015) in a blogpost to be published in a journal sometime ‘soon’. In this post I re-analyze the data reported in Lynn and Cheng (2013) as well as more data I obtained from the official Chinese statistical database.

Data sources

In their paper, they report 6 variables: 1) IQ, 2) sample size for IQ measurement, 3) % of population Ethnic Han, 4) years of education, 5) percent of higher education (percent with higher education?), and 6) GDP per capita. This only includes 3 socioeconomic variables — the bare minimum for S factor analyses — so I decided to see if I could find some more.

I spent some time on the database and found various useful variables:

  • Higher education per capita for 6 years
  • Medical technical personnel for 5 years
  • Life expectancy for 1 year
  • Proportion of population illiterate for 9 years
  • Internet users per capita for 10 years
  • Invention patents per capita for 10 years
  • Proportion of population urban for 9 years
  • Scientific personnel for 8 years

I used all available data for the last 10 years in all cases. This was done to increase the reliability of the measurement, in case there was some measurement error, and to reduce transient effects. In general tho regional differences were very consistent thruout the years, so this had little effect. One could do factor analysis and get the factor scores, but this would make the score hard to understand for the reader.

For the variables with data for multiple years, I calculated the average yearly intercorrelation to see how reliable the measures were. In all but one case, the average intercorrelation was >=.94, and in the last case it was .86. There would be little to gain from factor analyzing these data and using the scores instead; just averaging the years preserves interpretable data. Thus, I averaged each variable across years to produce one variable. This left me with 11 socioeconomic variables.
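The reliability check and the averaging step can be sketched like this. It is an illustrative Python sketch with made-up numbers, not the R code or the data used for the analysis; the function names are hypothetical.

```python
from itertools import combinations
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Plain Pearson product-moment correlation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mean_yearly_intercorrelation(years):
    """Average correlation over all pairs of years.

    years: one list per year, each holding one value per region,
    with the regions in the same order in every year.
    """
    return mean([pearson(a, b) for a, b in combinations(years, 2)])


def average_over_years(years):
    """Collapse the yearly values into one value per region."""
    return [mean(values) for values in zip(*years)]


# Hypothetical variable measured in 3 years for 4 regions.
yearly_values = [
    [10, 20, 30, 40],
    [12, 21, 33, 41],
    [11, 24, 31, 45],
]

reliability = mean_yearly_intercorrelation(yearly_values)
averaged = average_over_years(yearly_values)
print(reliability, averaged)
```

When the yearly intercorrelation is this high, the averaged variable carries nearly the same information as a factor score while staying in the original, interpretable units.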

Examining the S factor and MCV

The next step was to factor analyze the 11 variables and see if one general factor emerged with the right direction of loadings. It did in fact. The loadings are as follows:


All the loadings are in the expected direction. Aside from the one negative loading (illiteracy), they are all fairly strong. This means that MCV (method of correlated vectors) analysis is rather useless, since there is little inter-loading variation. One could probably fix this by going back to the databank and fetching some variables that are worse measures of S and that vary more.

Doing the MCV anyway results in r=.89 (inflated by the one negative loading). Excluding the negative loading gives r=.38, which is however solely due to the scientific personnel datapoint. To properly test it, one needs to fetch more data that varies more in its S loading.


S, IQ and Han%

We are now ready for the two main results, i.e. correlation of S with IQs and % ethnic Han.

[Figures: S and Han%; S and IQ]

Correlations are of moderate strength, r=.42 and r=.48. This is somewhat lower than found in analyses of Danish and Norwegian immigrant groups (Kirkegaard 2014, r’s about .55), lower than that found in India (r=.61), and much lower than that found between countries (r=.86). The IQ result is mostly due to the two large city areas of Beijing and Shanghai, so the results are not that convincing. They are tentative, but consistent with previous results.

Han ethnicity seems to be a somewhat more reasonable predictor in this dataset. This may not be due to higher general intelligence; the Han may have some other qualities that cause them to do well. Perhaps they are more conscientious or more rule-conforming, which is arguably rather important in authoritarian societies like China.

Supplementary material

The R code and datasets are available at the Open Science Framework repository for this study.



Differences in cognitive ability, per capita income, infant mortality, fertility and latitude across the states of India (test)

Richard Lynn and Prateek Yadav (2015) have a new paper out in Intelligence reporting various cognitive measures, socioeconomic outcomes and environmental factors in some 33 states and areas of India. Their analyses consist entirely of reporting the correlation matrix, but they list the actual data in two tables as well. This means that someone like me can reanalyze it.

They have data for the following variables:


Language Scores Class III (T1). These data consisted of the language scores of class III 11–12 year old school students in the National Achievement Survey (NAS) carried out in Cycle-3 by the National Council of Educational Research and Training (2013). The population sample comprised 104,374 students in 7046 schools across 33 states and union territories (UTs). The sample design for each state and UT involved a three-stage cluster design which used a combination of two probability sampling methods. At the first stage, districts were selected using the probability proportional to size (PPS) sampling principle in which the probability of selecting a particular district depended on the number of class 5 students enrolled in that district. At the second stage, in the chosen districts, the requisite number of schools was selected. PPS principles were again used so that large schools had a higher probability of selection than smaller schools. At the third stage, the required number of students in each school was selected using the simple random sampling (SRS) method. In schools where class 5 had multiple sections, an extra stage of selection was added with one section being sampled at random using SRS.

The language test consisted of reading comprehension and vocabulary, assessed by identifying the word for a picture. The test contained 50 items and the scores were analyzed using both Classical Test Theory (CTT) and Item Response Theory (IRT). The scores were transformed to a scale of 0–500 with a mean of 250 and standard deviation of 50. There were two forms of the test, one in English and the other in Hindi.


Mathematics Scores Class III (T2). These data consisted of the mathematics scores of Class III school students obtained by the same sample as for the Language Scores Class III described above. The test consisted of identifying and using numbers, learning and understanding the values of numbers (including basic operations), measurement, data handling, money, geometry and patterns. The test consisted of 50 multiple-choice items scored from 0 to 500 with a mean score was set at 250 with a standard deviation of 50.


Language Scores Class VIII (T3). These data consisted of the language scores of class VIII (14–15 year olds) obtained in the NAS (National Achievement Survey) a program carried out by the National Council of Educational Research and Training, 2013) Class VIII (Cycle-3).The sampling methodology was the same as that for class III described above. The population sample comprised 188,647 students in 6722 schools across 33 states and union territories. The test was a more difficult version of that for class III, and as for class III, scores were analyzed using both Classical Test Theory (CTT) and Item Response Theory (IRT), and were transformed to a scale of 0–500 with a mean 250.


Mathematics Scores Class VIII (T4). These data consisted of the mathematics scores of Class VIII (14–15 year olds) school students obtained by the same sample as for the Language Scores Class VIII described above. As with the other tests, the scores were transformed to a scale of 0–500 with a mean 250 and standard deviation of 50.


Science Scores Class VIII (T5). These data consisted of the science scores of Class VIII (14–15 year olds) school students obtained by the same sample as for the Language Scores Class VIII described above. As with the other tests, the scores were transformed to a scale of 0–500 with a mean 250 and standard deviation of 50. The data were obtained in 2012.


Teachers’ Index (TI). This index measures the quality of the teachers and was taken from the Elementary State Education Report compiled by the District Information System for Education (DISE, 2013). The data were recorded in September 2012 for teachers of grades 1–8 in 35 states and union territories. The sample consisted of 1,431,702 schools recording observations from 199.71 million students and 7.35 million teachers. The teachers’ Index is constructed from the percentages of schools with a pupil–teacher ratio in primary greater than 35, and the percentages single-teacher schools, teachers without professional qualification, and female teachers (in schools with 2 and more teachers).


Infrastructure Index (II). These data were taken from the Elementary State Education Report 2012–13 compiled by the District Information System for Education (2013). The sample was the same as for the Teachers’ Index described above. This index measures the infrastructure for education and was constructed from the percentages of schools with proper chairs and desks, drinking water, toilets for boys and girls, and with kitchens.


GDP per capita (GDP per cap). These data are the net state domestic product of the Indian states in 2008–09 at constant prices given by the Reserve Bank of India (2013). Data are not available for the Union Territories.


Literacy Rate (LR). This consists of the percentage of population aged 7 and above in given in the 2011 census published by the Registrar General and Census Commission of India (2011).


Infant Mortality Rate (IMR). This consists of the number of deaths of infants less than one year of age per 1000 live births in 2005–06 given in the National Family Health Survey, Infant and Child Mortality given by the Indian Institute of Population Sciences (2006).


Child Mortality Rate (CMR). This consists of the number of deaths of children 1–4 years of age per 1000 live births in the 2005–06 given by the Indian Institute of Population Sciences (2006).


Life Expectancy (LE). This consists of the number of years an individual is expected to live after birth, given in a 2007 survey carried out by Population Foundation of India (2008).


Fertility Rate (FR). This consists of the number of children born per woman in each state and union territories in 2012 given by Registrar General and Census Commission of India (2012).


Latitude (LAT). This consists of the latitude of the center of the state.


Coast Line (CL). This consists of whether states have a coast line or are landlocked and is included to examine whether the possession of a coastline is related to the state IQs.


Percentage of Muslims (MS). This is included to examine a possible relation to the state IQs.


This article will include the R code line for line commented as a helping exercise for readers not familiar with R but who can perhaps be convinced to give it a chance! :)

library(devtools) #source_url
source_url("") #mega functions from OSF
library(psych) #various
library(car) #scatterplot
library(Hmisc) #rcorr
library(VIM) #imputation

This loads a variety of libraries that are useful.

Getting the data into R

cog = read.csv("Lynn_table1.csv",skip=2,header=TRUE,row.names = 1) #load cog data
socio = read.csv("Lynn_table2.csv",skip=2,header=TRUE,row.names = 1) #load socio data

The files are the two tables one can download from ScienceDirect: Lynn_table1 and Lynn_table2. The code reads them assuming values are separated by commas (CSV = comma-separated values), skips the first two lines because they do not contain data, uses the next line as headers, and uses the first column as rownames.

Merging data into one object

Ideally, I’d like all the data as one object for easier use. However, since it comes in two, it has to be merged. For this purpose, I rely upon a dataset merger function I wrote some months ago to handle international data. It can however handle any merging of data where one wants to match rows by name from different datasets and combine them into one dataset. This function, merge_datasets(), is found in the mega_functions we imported earlier.

However, first, it is a good idea to make sure the names do match when they are supposed to. To check this we can type:


I put the output into Excel to check for mismatches:

Andhra Pradesh     Andhra Pradesh     TRUE
Arunachal Pradesh  Arunachal Pradesh  TRUE
Bihar              Bihar              TRUE
Chattisgarh        Chattisgarh        TRUE
Goa                Goa                TRUE
Gujarat            Gujarat            TRUE
Haryana            Haryana            TRUE
Himanchal Pradesh  Himanchal Pradesh  TRUE
Jammu Kashmir      Jammu & Kashmir    FALSE
Jharkhand          Jharkhand          TRUE
Karnataka          Karnataka          TRUE
Kerala             Kerala             TRUE
Madhya Pradesh     Madhya Pradesh     TRUE
Maharashtra        Maharashtra        TRUE
Manipur            Manipur            TRUE
Meghalaya          Meghalaya          TRUE
Mizoram            Mizoram            TRUE
Nagaland           Nagaland           TRUE
Odisha             Odisha             TRUE
Punjab             Punjab             TRUE
Rajashthan         Rajasthan          FALSE
Sikkim             Sikkim             TRUE
Tamil Nadu         TamilNadu          FALSE
Tripura            Tripura            TRUE
Uttarkhand         Uttarkhand         TRUE
Uttar Pradesh      Uttar Pradesh      TRUE
West Bengal        West Bengal        TRUE
A & N Islands      A & N Islands      TRUE
Chandigarh         Chandigarh         TRUE
D & N Haveli       D & N Haveli       TRUE
Daman & Diu        Daman & Diu        TRUE
Delhi              Delhi              TRUE
Puducherry         Puducherry         TRUE


So we see that the order is the same, but there are three pairs that don’t match despite being supposed to. We can fix this discrepancy by using the rownames of one dataset for the other:

rownames(cog) = rownames(socio) #use rownames from socio for cog

This makes the rownames of cog the same as those for socio. Now they are ready for merging.
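A check of this kind can also be done directly in R without going through Excel. The following is only a sketch with three invented rows per dataset; `cog_toy` and `socio_toy` are hypothetical stand-ins for the real objects, and the spellings are just examples.

```r
# Toy stand-ins for the two datasets (only the row names matter here)
cog_toy   = data.frame(T1 = c(250, 260, 240),
                       row.names = c("Goa", "Rajashthan", "Tamil Nadu"))
socio_toy = data.frame(GDP = c(3, 2, 4),
                       row.names = c("Goa", "Rajasthan", "TamilNadu"))

# Side-by-side comparison of the names, with a TRUE/FALSE match column
check = data.frame(cog = rownames(cog_toy),
                   socio = rownames(socio_toy),
                   match = rownames(cog_toy) == rownames(socio_toy))
mismatches = check[!check$match, ]  # rows that need attention

# If the order is known to be identical, copy one set of names to the other
rownames(cog_toy) = rownames(socio_toy)
all_match = all(rownames(cog_toy) == rownames(socio_toy))
```

Copying the names over is only safe when the row order really is the same in both datasets, which is why it is worth looking at the mismatches first.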

Incidentally, since the order is the same, we could have simply merged with the command:

cbind(cog, socio)

However it is good to use merge_datasets() since it is so much more generally useful.

Missing and broken data

Next up, we examine missing data and broken data.

#examine missing data

The first, miss.table(), is another custom function from mega_functions. It outputs the number of missing values per variable. The outputs are:

  0   0   0   6   0   4   0   0   0   0   0
T1 T2 T3 T4 T5 CA 
 0  0  0  0  0  0

So we see that there are 10 missing values in the socio and 0 in cog.

Next we want to see how these are missing. We can do this e.g. by plotting it with a nice function like matrixplot() (from VIM) or by tabling the missing cases. Output:

 0  1  2 
27  2  4


So we see that there are a few cases that miss data from 1 or 2 variables. Nothing serious.

One could simply ignore this, but that would not utilize the data to the full extent possible. The correct solution is to impute data rather than removing cases with missing data.

However, before we do this, look at the TI variable above. The greyscale shows the standardized values of the datapoints. So in this variable we see that there is one very strong outlier. If we take a look back at the data table, we see that it is likely an input error. All the other datapoints have values between 0 and 1, but the one for Uttarkhand has 247,200.595… I don’t see how the input error happened, so the best way is to remove it:
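For readers without the custom functions, the two missing-data summaries can be approximated in base R. This is a sketch on an invented data frame `toy`, not the miss.table() function itself.

```r
# Toy data with a few missing values (invented for illustration)
toy = data.frame(a = c(1, NA, 3, 4),
                 b = c(NA, NA, 2, 5),
                 c = c(7, 8, 9, 10))

miss_per_var  = colSums(is.na(toy))         # number of NAs in each variable
miss_per_case = table(rowSums(is.na(toy)))  # how many cases miss 0, 1, 2, ... values
```

The first object corresponds to the per-variable counts and the second to the table of cases by number of missing values.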

#fix broken datapoint
socio["Uttarkhand","TI"] = NA

Then, we impute the missing data in the socio variable:

#impute data
socio2 = irmi(socio, noise.factor = 0) #no noise

The second parameter is used for multiple imputation, which we don’t use here. Setting it as 0 means that the imputation is deterministic and hence exactly reproducible for other researchers.

Finally, we can compare the non-imputed dataset to the imputed one:

#compare desc stats
round(describe(socio)-describe(socio2),2) #discrepancy values, rounded

The output is large, so I won’t show it here, but it shows that the means, sd, range, etc. of the variables with and without imputation are similar which means that we didn’t completely mess up the data by the procedure.

Finally, we merge the data to one dataset:

#merge data
data = merge.datasets(cog,socio2,1) # merge above

Next, we want to do factor analysis to extract the general socioeconomic factor and the general intelligence factor from their respective indicators. And then we add them back to the main dataset:

#factor analysis
fa = fa(data[1:5]) #fa on cognitive data
data["G"] = as.numeric(fa$scores)

fa2 = fa(data[7:14]) #fa on SE data
data["S"] = as.numeric(fa2$scores)

Columns 1-5 are the 5 cognitive measures. Columns 7-14 are the socioeconomic ones. One can disagree about the literacy variable, which could be taken as belonging to the cognitive variables, not the socioeconomic ones. It is similar to the third cognitive variable, which is some language test. I follow the practice of the authors.

The output from the first factor analysis is:

    MR1     h2   u2 com
T1 0.40 0.1568 0.84   1
T2 0.10 0.0096 0.99   1
T3 0.46 0.2077 0.79   1
T4 0.93 0.8621 0.14   1
T5 0.92 0.8399 0.16   1
Proportion Var 0.42

This is using the default settings, which is minimum residuals. Since the method used typically does not matter except for PCA on small datasets, this is fine.
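To show the general idea of extracting one factor and saving its scores without relying on psych, here is a sketch using base R's factanal(). Note that this is maximum likelihood factor analysis, not the minimum residuals method used above, and the data are simulated; `latent`, `indicators` and `toy` are invented names.

```r
# Simulate 5 indicators that all depend on one latent variable (invented data)
set.seed(1)
n = 200
latent = rnorm(n)
true_loadings = c(0.8, 0.7, 0.6, 0.5, 0.4)
indicators = sapply(true_loadings,
                    function(l) l * latent + sqrt(1 - l^2) * rnorm(n))
colnames(indicators) = paste0("V", 1:5)

# One-factor model; regression scores give one value per case
fit = factanal(indicators, factors = 1, scores = "regression")
loadings_1 = as.numeric(fit$loadings[, 1])
scores_1 = as.numeric(fit$scores[, 1])

# The sign of a factor is arbitrary, so orient it so the loadings sum to a positive value
if (sum(loadings_1) < 0) {
  loadings_1 = -loadings_1
  scores_1 = -scores_1
}

toy = as.data.frame(indicators)
toy["Factor"] = scores_1  # add the scores back to the dataset, as done with G and S
```

With clean one-factor data like this, the choice of extraction method should matter little, which is in line with the remark above.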

All loadings are positive as expected, but T2 is only slightly so.

We put the factor scores back into the dataset and call it “G” (Rindermann, 2007).

The factor analysis output for socioeconomic variables is:

      MR1    h2   u2 com
Lit  0.79 0.617 0.38   1
II   0.36 0.128 0.87   1
TI   0.91 0.824 0.18   1
GDP  0.76 0.579 0.42   1
IMR -0.92 0.842 0.16   1
CMR -0.85 0.721 0.28   1
FER -0.84 0.709 0.29   1
LE   0.14 0.019 0.98   1
Proportion Var 0.55

Strong positive loadings for: proportion of population literate (LIT), teacher index (TI), GDP, medium positive for infrastructure index (II), weak positive for life expectancy (LE). Strong negative for infant mortality rate (IMR), child mortality rate (CMR) and fertility. All of these are in the expected direction.

Then we extract the factor scores and add them back to the dataset and call them “S”.


Finally, we want to check out the correlations with G and S.

#Pearson results
results = rcorr2(data)
View(results$r)  #view all correlations
results$r[,18:19] #S and G correlations
results$n #sample size

results.s = rcorr2(data, type="spearman") #spearman
View(results.s$r) #view all correlations

results.c = results$r-results.s$r

We look at both the Pearson and Spearman correlations because data may not be normal and may have outliers. Spearman’s is resistant to these problems. The discrepancy values are how much larger the Pearson is than the Spearman.
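A small invented example shows why the two can differ. Here the relation is perfectly monotone, but one extreme value distorts the Pearson correlation while leaving the Spearman correlation untouched.

```r
# Invented example: perfectly monotone relation with one extreme x value
x = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 100)
y = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

r_pearson  = cor(x, y, method = "pearson")   # affected by the extreme value
r_spearman = cor(x, y, method = "spearman")  # depends on ranks only
discrepancy = r_pearson - r_spearman         # negative here: Pearson is the smaller one
```

Large discrepancies of this kind are a hint to look at the scatterplot for outliers or nonlinearity.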

There are too many correlations to output here, so we focus on those involving G and S (columns 18:19).

 Variable G S
T1 0.41 0.41
T2 0.10 -0.39
T3 0.48 0.16
T4 0.97 0.62
T5 0.96 0.53
CA 0.87 0.38
Lit 0.66 0.81
II 0.45 0.37
TI 0.40 0.93
GDP 0.40 0.78
IMR -0.60 -0.94
CMR -0.54 -0.87
FER -0.56 -0.86
LE 0.01 0.14
LAT -0.53 -0.34
CL -0.63 -0.54
MS -0.24 -0.08
G 1.00 0.59
S 0.59 1.00


So we see that G and S correlate at .59, fairly high and similar to previous within-country results with immigrant groups (.54 in Denmark, Kirkegaard and Fuerst (2014); .59 in Norway, Kirkegaard (2014a)) but not quite as high as the between-country results (.86-.87, Kirkegaard (2014b)). Lynn and Yadav mention that data exist for France, Britain and the US. These can serve for reanalysis with respect to S factors at the regional/state level.

Finally, we may want to plot the main result:

title = paste0("India: State G-factor correlates ",round(results$r["S","G"],2)," with state S-factor, N=",results$n["S","G"])
scatterplot(S ~ G, data, smoother=FALSE, id.n=nrow(data),
            xlab = "G, extracted from 5 indicators",
            ylab = "S, extracted from 8 indicators",
            main = title)


It would be interesting if one could obtain genomic admixture measures for each state and see how they relate, since this has been found repeatedly elsewhere and is a strong prediction from genetic theory.


Lynn has sent me the correct datapoint. It is 0.595. The imputed value was around .72. I reran the analysis with this value and imputed the rest. It doesn’t change much. The new results are slightly stronger.

 Variable  G (new)  S (new)  G (discrepancy)  S (discrepancy)
T1 0.41 0.42 0.00 -0.01
T2 0.10 -0.37 0.00 -0.02
T3 0.48 0.18 0.00 -0.02
T4 0.97 0.63 0.00 -0.02
T5 0.96 0.54 0.00 -0.01
CA 0.87 0.40 0.00 -0.02
Lit 0.66 0.81 0.00 -0.01
II 0.45 0.37 0.00 0.00
TI 0.42 0.92 -0.02 0.01
GDP 0.40 0.78 0.00 0.00
IMR -0.60 -0.95 0.00 0.00
CMR -0.54 -0.87 0.00 0.00
FER -0.56 -0.86 0.00 -0.01
LE 0.01 0.14 0.00 0.00
LAT -0.53 -0.35 0.00 0.01
CL -0.63 -0.54 0.00 0.00
MS -0.24 -0.09 0.00 0.00
G 1.00 0.61 0.00 -0.01
S 0.61 1.00 -0.01 0.00


Method of correlated vectors

This study is special in that we have two latent variables each with its own set of indicator variables. This means that we can use Jensen’s method of correlated vectors (MCV; Jensen (1998)), and also a new version which I shall creatively dub “double MCV”, DMCV using both latent factors instead of only one.

The method consists of correlating the factor loadings of a set of indicator variables for a factor with the correlations of each indicator variable with a criterion variable. Jensen used this with the general intelligence factor (g-factor) and its subtests with criterion variables such as inbreeding depression in IQ scores and brain size.

So, to do regular MCV in this study, we first choose either the S or the G factor. Then we correlate the loadings of each indicator with its correlation with the criterion variable, i.e. the S/G factor we didn’t choose.

Doing this analysis is in fact very easy here, because the results reported in the table above with S and G are exactly what we need to correlate.

## MCV
#Double MCV
#MCV on G
#MCV on S

The results are: .87, .89, and .97. In other words, MCV gives a strong indication that it is the latent traits that are responsible for the observed correlations.
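Since the code lines did not survive, here is a sketch of the two single-factor MCV analyses. It is a reconstruction under assumptions, not the original script: it uses the rounded loadings and correlations printed in the first set of tables above, so the results should land close to the reported values but need not match them exactly. The double MCV is left out because I cannot tell from the text exactly which vectors went into it.

```r
# Factor loadings and cross-factor correlations copied from the tables above
g_loadings = c(T1 = 0.40, T2 = 0.10, T3 = 0.46, T4 = 0.93, T5 = 0.92)
r_with_S   = c(T1 = 0.41, T2 = -0.39, T3 = 0.16, T4 = 0.62, T5 = 0.53)

s_loadings = c(Lit = 0.79, II = 0.36, TI = 0.91, GDP = 0.76,
               IMR = -0.92, CMR = -0.85, FER = -0.84, LE = 0.14)
r_with_G   = c(Lit = 0.66, II = 0.45, TI = 0.40, GDP = 0.40,
               IMR = -0.60, CMR = -0.54, FER = -0.56, LE = 0.01)

# MCV: correlate the vector of loadings with the vector of criterion correlations
mcv_G = cor(g_loadings, r_with_S)  # G indicators, S as the criterion
mcv_S = cor(s_loadings, r_with_G)  # S indicators, G as the criterion
round(c(mcv_G = mcv_G, mcv_S = mcv_S), 2)
```

Keep in mind that the negative loadings in the S set stretch the range of the loading vector, which tends to push the correlation upwards.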


Jensen, A. R. (1998). The g factor: The science of mental ability. Westport, CT: Praeger.

Kirkegaard, E. O. W. (2014a). Crime, income, educational attainment and employment among immigrant groups in Norway and Finland. Open Differential Psychology.

Kirkegaard, E. O. W., & Fuerst, J. (2014). Educational attainment, income, use of social benefits, crime rate and the general socioeconomic factor among 71 immigrant groups in Denmark. Open Differential Psychology.

Kirkegaard, E. O. W. (2014b). The international general socioeconomic factor: Factor analyzing international rankings. Open Differential Psychology.

Lynn, R., & Yadav, P. (2015). Differences in cognitive ability, per capita income, infant mortality, fertility and latitude across the states of India. Intelligence, 49, 179-185.

Rindermann, H. (2007). The g‐factor of international cognitive ability comparisons: The homogeneity of results in PISA, TIMSS, PIRLS and IQ‐tests across nations. European Journal of Personality, 21(5), 667-706.

So, she came up with:

So I decided to try it out, since I’m taking a break from reading Lilienfeld, which I had been doing for 5 hours straight or so.

So the question is whether inbreeding measures have incremental validity over IQ and Islam, which I have previously used to examine immigrant performance in a number of studies.

So, to get the data into R, I OCR’d the PDF in Abbyy FineReader since this program allows for easy copying of table data by row or column. I only wanted columns 1-2 and didn’t want to deal with the hassle of importing it with spreadsheet programs (which need a consistent separator, e.g. comma or space). Then I merged it with the megadataset (version 2.0d) to create a new version, 2.0e.

Then I created a subset of the data with variables of interest, and renamed them (otherwise results would be unwieldy). Intercorrelations are:

row.names Cousin% CoefInbreed IQ Islam
1 Cousin% 1.00 0.52 -0.59 0.78 -0.76
2 CoefInbreed 0.52 1.00 -0.28 0.40 -0.55
3 IQ -0.59 -0.28 1.00 -0.27 0.54
4 Islam 0.78 0.40 -0.27 1.00 -0.71
5 -0.76 -0.55 0.54 -0.71 1.00


Spearman correlations, which are probably better due to the non-normal data:

row.names Cousin% CoefInbreed IQ Islam
1 Cousin% 1.00 0.91 -0.63 0.67 -0.73
2 CoefInbreed 0.91 1.00 -0.55 0.61 -0.76
3 IQ -0.63 -0.55 1.00 -0.23 0.72
4 Islam 0.67 0.61 -0.23 1.00 -0.61
5 -0.73 -0.76 0.72 -0.61 1.00


The fairly high correlations of the inbreeding measures with IQ and Islam mean that their incremental validity will likely be modest.

However, let’s try modeling them. I create 7 models of interest and compile the primary measure of interest from them, R2 adjusted, into an object. Looks like this:

row.names R2 adj.
1 ~ IQ+Islam 0.5472850
2 ~ IQ+Islam+CousinPercent 0.6701305
3 ~ IQ+Islam+CoefInbreed 0.7489312
4 ~ Islam+CousinPercent 0.6776841
5 ~ Islam+CoefInbreed 0.7438711
6 ~ IQ+CousinPercent 0.5486674
7 ~ IQ+CoefInbreed 0.4979552


So we see that either of them adds a fair amount of incremental validity to the base model (line 1 vs. 2-3). They are in fact better than IQ if one substitutes them in (1 vs. 4-5). They can also substitute for Islam, but only with about the same predictive power (1 vs 6-7).
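The logic of comparing adjusted R2 across models can be sketched on simulated data. The variables `x1`, `x2`, `x3` and `y` are invented and have nothing to do with the actual dataset; the point is only the comparison of a base model with an extended one.

```r
# Invented data: outcome depends on two base predictors plus a third one
set.seed(1)
n = 100
x1 = rnorm(n); x2 = rnorm(n); x3 = rnorm(n)
y = 0.5 * x1 + 0.5 * x2 + 0.5 * x3 + rnorm(n)
toy = data.frame(y, x1, x2, x3)

models = c("y ~ x1 + x2",       # base model
           "y ~ x1 + x2 + x3")  # base model plus the extra predictor

# Fit each model and keep only the adjusted R2
adj_r2 = sapply(models, function(m) {
  summary(lm(as.formula(m), data = toy))$adj.r.squared
})
increment = unname(adj_r2[2] - adj_r2[1])  # gain from adding x3
```

Adjusted R2 is used instead of plain R2 because plain R2 can never decrease when a predictor is added, even a useless one.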

Replication for Norway

Replication is important in science. Let’s try Norwegian data. The Finnish and Dutch data are not well-suited for this (too few immigrant groups, few outcome variables i.e. only crime).

Pearson intercorrelations:

row.names CousinPercent CoefInbreed IQ Islam
1 CousinPercent 1.00 0.52 -0.59 0.78 -0.78
2 CoefInbreed 0.52 1.00 -0.28 0.40 -0.46
3 IQ -0.59 -0.28 1.00 -0.27 0.60
4 Islam 0.78 0.40 -0.27 1.00 -0.72
5 -0.78 -0.46 0.60 -0.72 1.00

Spearman intercorrelations:



row.names CousinPercent CoefInbreed IQ Islam
1 CousinPercent 1.00 0.91 -0.63 0.67 -0.77
2 CoefInbreed 0.91 1.00 -0.55 0.61 -0.71
3 IQ -0.63 -0.55 1.00 -0.23 0.75
4 Islam 0.67 0.61 -0.23 1.00 -0.47
5 -0.77 -0.71 0.75 -0.47 1.00


These look fairly similar to Denmark.

And the regression results:

row.names R2 adj.
1 ~ IQ+Islam 0.5899682
2 ~ IQ+Islam+CousinPercent 0.7053999
3 ~ IQ+Islam+CoefInbreed 0.7077162
4 ~ Islam+CousinPercent 0.6826272
5 ~ Islam+CoefInbreed 0.6222364
6 ~ IQ+CousinPercent 0.6080922
7 ~ IQ+CoefInbreed 0.5460777


Fairly similar too. If added, they have incremental validity (line 1 vs. 2-3). They perform better than IQ if substituted but not as much as in the Danish data (1 vs. 4-5). They can also substitute for Islam (1 vs. 6-7).

How to interpret?

Since inbreeding does not seem to have any direct influence on behavior that is reflected in the S factor, it is not so easy to interpret these findings. Inbreeding leads to various health problems and lower g in offspring, the latter of which may have some effect. However, presumably, national IQs already reflect the lowered IQ from inbreeding, so there should be no additional effect there beyond national IQs. Perhaps inbreeding results in other psychological problems that are relevant.

Another idea is that inbreeding rates reflect non-g psychological traits that are relevant to adapting to life in Denmark. Perhaps it is a useful measure of clannishness, which would be reflected in hostility towards integration in Danish society (such as getting an education, or lack of sympathy/antipathy towards ethnic Danes and resulting higher crime rates against them), which in turn would be reflected in the S factor.

The lack of well-established causal routes makes me somewhat cautious about how to interpret this finding.


##Code for merging cousin marriage+inbreeding data with megadataset
inbreed = read.table("clipboard", sep="\t",header=TRUE, row.names=1) #load data from clipboard
source("merger.R") #load mega functions
mega20d = read.mega("Megadataset_v2.0d.csv") #load latest megadataset
names = as.abbrev(rownames(inbreed)) #get abbreviated names
rownames(inbreed) = names #set them as rownames

#merge and save
mega20e = merge.datasets(mega20d,inbreed,1) #merge to create v. 2.0e
write.mega(mega20e,"Megadataset_v2.0e.csv") #save it

#select subset of interesting data
 = subset(mega20e, selec=c("Weighted.mean.consanguineous.percentage.HobenEtAl2010",
colnames( = c("CousinPercent","CoefInbreed","IQ","Islam","") #shorter var names
rcorr = rcorr(as.matrix( #correlation object
View(round(rcorr$r,2)) #view correlations, round to 2
rcorr.S = rcorr(as.matrix(,type = "spearman") #spearman correlation object
View(round(rcorr.S$r,2)) #view correlations, round to 2

#Multiple regression
library(QuantPsyc) #for beta coef
results = = NA, nrow=0, ncol = 1)) #empty matrix for results
colnames(results) = "R2 adj."
models = c(" ~ IQ+Islam", #base model,
           " ~ IQ+Islam+CousinPercent", #1. inbreeding var
           " ~ IQ+Islam+CoefInbreed", #2. inbreeding var
           " ~ Islam+CousinPercent", #without IQ
           " ~ Islam+CoefInbreed", #without IQ
           " ~ IQ+CousinPercent", #without Islam
           " ~ IQ+CoefInbreed") #without Islam

for (model in models){ #run all the models
  fit.model = lm(model, #fit model
  sum.stats = summary(fit.model) #summary stats object
  summary(fit.model) #summary stats
  lm.beta(fit.model) #standardized betas
  results[model,] = sum.stats$adj.r.squared #add result to results object
} #end of loop
View(results) #view results

##Let's try Norway too
 = subset(mega20e, selec=c("Weighted.mean.consanguineous.percentage.HobenEtAl2010",

colnames( = c("CousinPercent","CoefInbreed","IQ","Islam","") #shorter var names
rcorr = rcorr(as.matrix( #correlation object
View(round(rcorr$r,2)) #view correlations, round to 2
rcorr.S = rcorr(as.matrix(,type = "spearman") #spearman correlation object
View(round(rcorr.S$r,2)) #view correlations, round to 2

results = = NA, nrow=0, ncol = 1)) #empty matrix for results
colnames(results) = "R2 adj."
models = c(" ~ IQ+Islam", #base model,
           " ~ IQ+Islam+CousinPercent", #1. inbreeding var
           " ~ IQ+Islam+CoefInbreed", #2. inbreeding var
           " ~ Islam+CousinPercent", #without IQ
           " ~ Islam+CoefInbreed", #without IQ
           " ~ IQ+CousinPercent", #without Islam
           " ~ IQ+CoefInbreed") #without Islam

for (model in models){ #run all the models
  fit.model = lm(model, #fit model
  sum.stats = summary(fit.model) #summary stats object
  summary(fit.model) #summary stats
  lm.beta(fit.model) #standardized betas
  results[model,] = sum.stats$adj.r.squared #add result to results object
} #end of loop
View(results) #view results

There was some talk on Twitter around prison rates and inequality:

And IQ and inequality:

But then what about prison data beyond those given above? I have downloaded the newest data from ICPS (rate data, not totals).

Now, what about all three variables?

#load mega20d as the datafile
ineqprisoniq = subset(mega20d, select=c("Fact1_inequality","LV2012estimatedIQ","PrisonRatePer100000ICPS2015"))
rcorr(as.matrix(ineqprisoniq),type = "spearman")
                            Fact1_inequality LV2012estimatedIQ PrisonRatePer100000ICPS2015
Fact1_inequality                        1.00             -0.51                        0.22
LV2012estimatedIQ                      -0.51              1.00                        0.16
PrisonRatePer100000ICPS2015             0.22              0.16                        1.00

                            Fact1_inequality LV2012estimatedIQ PrisonRatePer100000ICPS2015
Fact1_inequality                         275               119                         117
LV2012estimatedIQ                        119               275                         193
PrisonRatePer100000ICPS2015              117               193                         275

So IQ is slightly positively related to prison rates and so is inequality. Positive? Isn’t it bad having people in prison? Well, if the alternative is having them dead… because the punishment for most crimes is death. Although one need not be as excessive as the US is. Somewhere in the middle is perhaps best?

What if we combine them into a model?

model = lm(PrisonRatePer100000ICPS2015 ~ Fact1_inequality+LV2012estimatedIQ,ineqprisoniq)
summary = summary(model)
prediction =
colnames(prediction) = "Predicted"
ineqprisoniq = merge.datasets(ineqprisoniq,prediction,1)
scatterplot(PrisonRatePer100000ICPS2015 ~ Predicted, ineqprisoniq,
> summary

lm(formula = PrisonRatePer100000ICPS2015 ~ Fact1_inequality + 
    LV2012estimatedIQ, data = ineqprisoniq)

    Min      1Q  Median      3Q     Max 
-153.61  -75.05  -31.53   44.62  507.34 

                  Estimate Std. Error t value Pr(>|t|)   
(Intercept)       -116.451     88.464  -1.316  0.19069   
Fact1_inequality    31.348     11.872   2.640  0.00944 **
LV2012estimatedIQ    3.227      1.027   3.142  0.00214 **
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 113.6 on 114 degrees of freedom
  (158 observations deleted due to missingness)
Multiple R-squared:  0.09434,	Adjusted R-squared:  0.07845 
F-statistic: 5.938 on 2 and 114 DF,  p-value: 0.003523

> lm.beta(model)
Fact1_inequality LV2012estimatedIQ 
        0.2613563         0.3110241

This is a pretty bad model (var%=8), but the directions held from before and the effects were stronger. Standardized betas are .26-.31. The R2 seems to be awkwardly low to me given the betas.
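One reason the R2 can look low next to the betas is that the two predictors are negatively correlated with each other. With two standardized predictors, R2 = b1^2 + b2^2 + 2*b1*b2*r12, so a negative r12 subtracts from the total. Plugging in the rounded values reported above (betas of .26 and .31, and the Spearman correlation of -.51 between the predictors as a rough stand-in for the Pearson one) gives about .08, in the neighborhood of the reported R2 of .09. The sketch below checks the identity on simulated data; all variable names and values are invented.

```r
# Invented data: two negatively correlated predictors, both with positive effects
set.seed(1)
n = 500
x1 = rnorm(n)
x2 = -0.5 * x1 + sqrt(1 - 0.25) * rnorm(n)
y = 0.3 * x1 + 0.3 * x2 + rnorm(n)

# Standardized betas: refit the model on z-scored variables
z = as.data.frame(scale(data.frame(y, x1, x2)))
fit_z = lm(y ~ x1 + x2, data = z)
betas = coef(fit_z)[c("x1", "x2")]

# With two predictors, R2 = b1^2 + b2^2 + 2*b1*b2*r12
r12 = cor(x1, x2)
r2_from_betas = unname(betas[1]^2 + betas[2]^2 + 2 * betas[1] * betas[2] * r12)
r2_model = summary(fit_z)$r.squared
```

Refitting on z-scored variables is used here in place of lm.beta() from QuantPsyc so that the sketch runs in base R.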

More importantly, the residuals are clearly not normal as can be seen above. The QQ-plot is:


It is concave, so data distribution isn’t normal. To get diagnostic plots, simply use “plot(model)”.

Perhaps try using rank-order data:

ineqprisoniq =,2,rank,na.last="keep")) #rank order the data

And then rerunning model gives:

> summary

lm(formula = PrisonRatePer100000ICPS2015 ~ Fact1_inequality + 
    LV2012estimatedIQ, data = ineqprisoniq)

     Min       1Q   Median       3Q      Max 
-100.236  -46.753   -8.507   46.986  125.211 

                  Estimate Std. Error t value Pr(>|t|)    
(Intercept)        1.08557   18.32052   0.059    0.953    
Fact1_inequality   0.84766    0.16822   5.039 1.78e-06 ***
LV2012estimatedIQ  0.50094    0.09494   5.276 6.35e-07 ***
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 54.36 on 114 degrees of freedom
  (158 observations deleted due to missingness)
Multiple R-squared:  0.2376,	Adjusted R-squared:  0.2242 
F-statistic: 17.76 on 2 and 114 DF,  p-value: 1.924e-07

> lm.beta(model)
 Fact1_inequality LV2012estimatedIQ 
        0.4757562         0.4981808

Much better R2; the directions are the same but the betas are stronger, and the residuals look normalish from the above. The QQ plot shows that they are still not quite normal, even now.


Prediction plots based off the models:

[Figures: prison, prison_rank]

So is something strange going on with the IQ, inequality and prison rates? Perhaps something nonlinear. Let’s plot them by IQ bins:

library(gplots) #plotmeans comes from the gplots package
bins = cut(unlist(ineqprisoniq["LV2012estimatedIQ"]),5) #divide IQs into 5 bins
ineqprisoniq["IQ.bins"] = bins
plotmeans(PrisonRatePer100000ICPS2015 ~ IQ.bins, ineqprisoniq,
          main = "Prison rate by national IQ bins",
          xlab = "IQ bins (2012 data)", ylab = "Prison rate per 100000 (2014 data)")


That looks like “bingo!” to me. We found the pattern.

What about inequality? The trouble is that the inequality data is horribly skewed, with almost all countries having low and near-identical inequality compared with the extremes. The approach above does not work well for it. I tried different numbers of bins too. Results look something like this:

bins = cut(unlist(ineqprisoniq["Fact1_inequality"]),5) #divide inequality scores into 5 bins
ineqprisoniq["inequality.bins"] = bins
plotmeans(PrisonRatePer100000ICPS2015 ~ inequality.bins, ineqprisoniq,
          main = "Prison rate by national inequality bins",
          xlab = "inequality bins", ylab = "Prison rate per 100000 (2014 data)")


So basically, the most equal countries on the left have low rates, the more unequal countries within the main group have somewhat higher rates, and the very unequal countries vary a lot and are on average lowish (African countries without much infrastructure?).

Perhaps this is why the Equality Institute limited their analyses to the group on the left; otherwise they would not get the nice clear pattern they want. One can see it a little bit if one uses a high number of bins and ignores the groups to the right. E.g. 10 bins:


Among the 3 first groups, there is a slight upward trend.
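One alternative to equal-width bins when a variable is this skewed is to cut at quantiles, so that every bin holds the same number of countries. This is a different technique from the one used above, and the sketch below runs on simulated data with hypothetical variable names, not on the real dataset.

```r
set.seed(2)
# Simulated right-skewed inequality scores and prison rates for 100 countries
ineq   <- rexp(100)
prison <- 100 + 30 * ineq + rnorm(100, sd = 40)

# Break points at the 0th, 20th, ..., 100th percentiles give 5 equal-sized bins
breaks <- quantile(ineq, probs = seq(0, 1, by = 0.2))
q_bins <- cut(ineq, breaks = breaks, include.lowest = TRUE)

bin_n    <- table(q_bins)                # number of countries per bin
bin_mean <- tapply(prison, q_bins, mean) # mean prison rate per bin
print(bin_n)
print(bin_mean)
```

The price is that the bins are no longer equally wide on the inequality scale, so the last bin lumps together everything in the long right tail.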

I had seen references to this book in a number of places which got me curious. I am somewhat hesitant to read older books since I know much of what they discuss is dated and has been superseded by newer science. Sometimes, however, science (or the science culture) has gone wrong so one may actually learn more reading an older book than a newer one. Since fewer people read older books, one can sometimes find relevant but forgotten facts in them. Lastly, they can provide much needed historical information about the development of thinking about some idea or of some field. All of these remarks are arguably relevant to the race/population genetics controversy.

Still, I did not read the book immediately altho I had a PDF of it. I ended up starting to read it more or less at random due to a short talk I had with John Fuerst about it (we are writing together on racial admixture, intelligence and socioeconomic outcomes in the Americas and also wrote a paper on immigrant performance in Denmark).

So, the book really is dated. It spends hundreds of pages on arcane fysical anthropology which requires one to master human anatomy. Most readers don’t master this discipline, so these parts of the book are virtually un-understandable. However, they do provide one with the distinct impression of how one did fysical anthropology in old times. Lots of observations of cranium, other bones, noses, eyes+lids, teeth, lips, buttocks, etc., and then trying to find clusters in these data manually. No wonder they did not reach very high agreement. The data are too scarce to find clusters and humans are not sufficiently good at cluster analysis at the intuitive level. Still, they did notice some patterns that are surely correct, such as the division between various African populations, Ainu vs. Japanese, that Europeans and Asians are more closely related, that Afghans etc. belong to the European supercluster etc. Clearly, these pre-genetic ideas were not all totally wrong headed. Here’s the table of Races+Subraces from the end of the book. They seem reasonably in line with modern evidence.


Some quotes:

The story of 7 ‘kinds’ of mosquitoes.

[Dobzhansky’s definition = ‘Species in sexual cross-fertilizing organisms can be defined as groups of populations which are reproductively isolated to the extent that the exchange of genes between them is absent or so slow that the genetic differences are not diminished or swamped.’]

Strict application of Dobzhansky’s definition results in certain very similar animals being assigned to different species. The malarial mosquitoes and their relatives provide a remarkable example of this. The facts are not only extremely interesting from the purely scientific point of view, but also of great practical importance in the maintenance of public health in malarious districts. It was discovered in 1920 that one kind of the genus Anopheles, called elutus, could be distinguished from the well-known malarial mosquito, A. maculipennis, by certain minute differences in the adult, and by the fact that its eggs looked different; but for our detailed knowledge of this subject we are mainly indebted to one Falleroni, a retired inspector of public health in Italy, who began in 1924 to breed Anopheles mosquitoes as a hobby. He noticed that several different kinds of eggs could be distinguished, that the same female always laid eggs having the same appearance, and that adult females derived from those eggs produced eggs of the same type. He realized that although the adults all appeared similar, there were in fact several different kinds, which he could recognize by the markings on their eggs. Falleroni named several different kinds after his friends, and the names he gave are the accepted ones today in scientific nomenclature.

It was not until 1931 that the matter came to the attention of L. W. Hackett, who, with A. Missiroli, did more than anyone else to unravel the details of this curious story.[449, 447, 448] The facts are these. There are in Europe six different kinds of Anopheles that cannot be distinguished with certainty from one another in the adult state, however carefully they are examined under the microscope by experts; a seventh kind, elutus, can be distinguished by minor differences if its age is known. The larvae of two of the kinds can be distinguished from one another by minute differences (in the type of palmate hair on the second segment, taken in conjunction with the number of branches of hair no. 2 on the fourth and fifth segments). Other supposed differences between the kinds, apart from those in the eggs, have been shown to be unreal.

In nature the seven kinds are not known to interbreed, and it is therefore necessary, under Dobzhansky’s definition, to regard them all as separate species.

The males of six of the seven species have the habit of ‘swarming’ when ready to copulate. They join in groups of many individuals, humming, high in the air; suddenly the swarm bursts asunder and rejoins. The females recognize the swarms of males of their own species, and are attracted towards them. Each female dashes in, seizes a male, and flies off, copulating.

With the exceptions mentioned, the only visible differences between the species occur at the egg-stage. The eggs of six of the seven species are shown in Fig. 8 (p. 76).

[Fig. 8: eggs of six of the seven Anopheles species]

It will be noticed that each egg is roughly sausage-shaped, with an air-filled float at each side, which supports it in the water in which it is laid. The eggs of the different species are seen to differ in the length and position of the floats. The surface of the rest of the egg is covered all over with microscopic finger-shaped papillae, standing up like the pile of a carpet. It is these papillae that are responsible for the distinctive patterns seen on the eggs of the different species. Where the papillae are long and their tips rough, light is reflected to give a whitish appearance; where they are short and smooth, light passes through to reveal the underlying surface of the egg, which is black. The biological significance of these apparently trivial differences is unknown.

From the point of view of the ethnic problem the most interesting fact is this. Although the visible differences between the species are trivial and confined or almost confined to the egg-stage, it is evident that the nervous and sensory systems are different, for each species has its own habits. The males of one species (atroparvus) do not swarm. It has already been mentioned that the females recognize the males of their own species. Some of the species lay their eggs in fresh water, others in brackish. The females of some species suck the blood of cattle, and are harmless to man; those of other species suck the blood of man, and in injecting their saliva transmit malaria to him.

Examples could be quoted of other species that are distinguishable from one another by morphological differences no greater than those that separate the species of Anopheles; but the races of a single species—indeed, the subraces of a single race—are often distinguished from one another, in their typical forms, by obvious differences, affecting many parts of the body. It is not the case that species are necessarily very distinct, and races very similar. [p. 74ff]

Nature is very odd indeed! More on Wiki.

Some very strange examples of abnormalities of this sort have been recorded by reputable authorities. Buffon quotes two examples of an ‘amour violent’ between a dog and a sow. In one case the dog was a large spaniel on the property of the Comte de Feuillee, in Burgundy. Many persons witnessed ‘the mutual ardour of these two animals; the dog even made prodigious and oft-repeated efforts to copulate with the sow, but the unsuitability of their reproductive organs prevented their union.’ Another example, still more remarkable, occurred on Buffon’s own property. A miller kept a mare and a bull in the same stable. These two animals developed such a passion for one another that on all occasions when the mare was on heat, over a period of several years, the bull copulated with her three or four times a day, whenever he was free to do so. The act was witnessed by all the inhabitants of the place. [p. 92]

Of smelly Japanese:

There is, naturally enough, a correlation between the development of the axillary organ and the smelliness of the secretion of this gland (and probably this applies also to the a glands of the genito-anal region). Briefly, the Europids and Negrids are smelly, the Mongolids scarcely or not at all, so far as the axillary secretion is concerned. Adachi, who has devoted more study to this subject than anyone else, has summed up his findings in a single, short sentence: ‘The Mongolids are essentially an odourless or very slightly smelly race with dry ear-wax.’[5] Since most of the Japanese are free or almost free from axillary smell, they are very sensitive to its presence, of which they seem to have a horror. About 10% of Japanese have smelly axillae. This is attributed to remote Ainuid ancestry, since the Ainu are invariably smelly, like most other Europids, and a tendency to smelliness is known to be inherited among the Japanese. 151 The existence of the odour is regarded among Japanese as a disease, osmidrosis axillae, which warrants (or used to warrant) exemption from military service. Certain doctors specialize in its treatment, and sufferers are accustomed to enter hospital. [p. 173]

Japan always takes these things to a new level.

Measurements of adult stature, made on several thousand pairs of persons, show a rather close correspondence with these figures, namely, 0.507, 0.322, 0.543, and 0.287 respectively.[172] It will be noticed that the correlations are all somewhat higher than one would expect; that is to say, the members of each pair are, on average, rather more nearly of the same height than the simple theory would suggest. This is attributed in the main to the tendency towards assortative mating, the reality of which had already been recognized by Karl Pearson and Miss Lee in their paper published in 1903. [p. 462]

I didn’t know assortative mating was recognized so far back. This may be a good source to understand the historical development of understanding of assortative mating.

The reference is: Pearson, K. & Lee, A., 1903. ‘On the laws of inheritance in man. I. Inheritance of physical characters.’ Biometrika, 2, 357–462.

Definition of intelligence?

What has been said on p. 496 may now be rewritten in the form of a short definition of intelligence, in the straightforward, everyday sense of that word. It is the ability to perceive, comprehend, and reason, combined with the capacity to choose worth-while subjects for study, eagerness to acquire, use, transmit, and (if possible) add to knowledge and understanding, and the faculty for sustained effort towards these ends (cf. p. 438). One might say briefly that a person is intelligent in so far as his cognitive ability and personality tend towards productiveness through mental activity. [p. 495ff]

Baker prefers a broader definition of “intelligence” which includes certain non-cognitive parts. He uses “cognitive ability” the way many people nowadays use “general cognitive ability”.

And now surely at the end of the book, the evil master-racist privileged white male John Baker tells us what to do with the information we just learned in the book:

Here, on reaching the end of the book, I must repeat some words that I wrote years ago when drafting the Introduction (p. 6), for there is nothing in the whole work that would tend to contradict or weaken them:
Every ethnic taxon of man includes many persons capable of living responsible and useful lives in the communities to which they belong, while even in those taxa that are best known for their contributions to the world’s store of intellectual wealth, there are many so mentally deficient that they would be inadequate members of any society. It follows that no one can claim superiority simply because he or she belongs to a particular ethnic taxon. [p. 534]

So, clearly according to our anti-racist heroes, Baker tells us to revel in our (sorry Jayman if you are reading!) European master ancestry, right?

edited: removed joke because public image -_-

Richard Lynn is nice enough to periodically send me books for free. He is working on establishing his publisher, of course, and so needs media coverage.

In this case, he sent me a new book on the Roma by Jelena Cvorovic, who was also present at the London conference on intelligence in spring 2014. She has previously published a number of papers on the Roma from her field studies. Of most interest to differential psychologists (such as me) is that the Roma obtain very low scores on g tests, scores not generally seen outside SS Africa. In the book, she reviews much of the literature on the Roma, covering their history, migration in Europe, religious beliefs and other strange cultural beliefs. For instance, did you know that many Roma consider themselves ‘Egyptians’? Very odd! Her review also covers the more traditional stuff like medical problems, sociological conditions, crime rates and the like. Generally, they do very poorly, probably only on par with the very worst performing immigrant groups in Scandinavia (Somalis, Lebanese, Syrians and similar). Perhaps they are part of the reason why people from Serbia do so poorly in Denmark. Perhaps those immigrants are mostly Roma? There are no records of more specific ethnicities in Denmark for immigrant groups to my knowledge. Similar puzzles concern immigrants coded as “stateless”, who are presumably mostly from Palestine, immigrants from Israel (perhaps mostly Muslims?) and, conversely, immigrants from South Africa (perhaps mostly Europeans?).

Another interesting part of the book is the second-to-last chapter, which covers the Roma kings. I had never heard of these, but apparently there are or were a few very rich Roma. They built elaborate castles with their money, which one can now see in various places in Eastern Europe. After they lost their income (which was due to black market trading during communism and similar activities), they seem to have reverted to the normal Roma pattern of unemployment, fast life style, crime and state benefits. This provides another illustration of the idea that if a group of persons for some reason acquires wealth, it will not generally boost their g or other capabilities, and their wealth will go away again once the particular circumstance that gave rise to it disappears. Other examples of this pattern are the story of Nauru and people who get rich from sports but are not very clever (e.g. African American athletes such as Mike Tyson). Oil States have also not seen any massive increase in g due to their oil riches, nor are people who win lotteries known to suddenly acquire higher g. Clearly, there cannot be a strong causal link from income to g.

In general, this book was better than expected and definitely worth a read for those interested in psychologically informed history.

G.M. IQ & Economic growth

I noted down some comments while reading it.

In Table 1, Dominican birth cohort is reversed.


“0.70 and 0.80 in world-wide country samples. Figure 1 gives an impression of this relationship.”


Figure 1 shows regional IQs, not GDP relationships.

“We still depend on these descriptive methods of quantitative genetics because only a small proportion of individual variation in general intelligence and school achievement can be explained by known genetic polymorphisms (e.g., Piffer, 2013a,b; Rietveld et al, 2013).”


We don’t. Modern BG studies can confirm A^2 estimates directly from the genes.


Davies, G., Tenesa, A., Payton, A., Yang, J., Harris, S. E., Liewald, D., … & Deary, I. J. (2011). Genome-wide association studies establish that human intelligence is highly heritable and polygenic. Molecular psychiatry, 16(10), 996-1005.

Marioni, R. E., Davies, G., Hayward, C., Liewald, D., Kerr, S. M., Campbell, A., … & Deary, I. J. (2014). Molecular genetic contributions to socioeconomic status and intelligence. Intelligence, 44, 26-32.

Results are fairly low tho, in the 20’s, presumably due to non-additive heritability and rarer genes.


“Even in modern societies, the heritability of intelligence tends to be higher for children from higher socioeconomic status (SES) families (Turkheimer et al, 2003; cf. Nagoshi and Johnson, 2005; van der Sluis et al, 2008). Where this is observed, most likely environmental conditions are of similar high quality for most high-SES children but are more variable for low-SES children.”


Or maybe not. There are also big studies that don’t find this interaction effect.


“Schooling has only a marginal effect on growth when intelligence is included, consistent with earlier results by Weede & Kämpf (2002) and Ram (2007).”

In the regression model of all countries, schooling has a larger beta than IQ does (.158 and .125). But these appear to be unstandardized values, so they are not readily comparable.
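To make the two comparable, one can convert unstandardized slopes to standardized betas by multiplying each by SD(predictor)/SD(outcome). A minimal sketch on simulated data follows; the variable names and all numbers are invented for illustration and have nothing to do with the values in the paper.

```r
set.seed(3)
n <- 200
# Simulated country-level data (hypothetical values)
schooling <- rnorm(n, mean = 8, sd = 3)
iq        <- 85 + 1.5 * schooling + rnorm(n, sd = 8)
growth    <- 0.1 * schooling + 0.05 * iq + rnorm(n)

fit <- lm(growth ~ schooling + iq)
b   <- coef(fit)[-1]  # unstandardized slopes, intercept dropped

# Standardized beta = unstandardized slope * SD(predictor) / SD(outcome)
std_beta <- b * c(sd(schooling), sd(iq)) / sd(growth)

# Cross-check: refit the model with every variable z-scored
z <- data.frame(growth    = as.numeric(scale(growth)),
                schooling = as.numeric(scale(schooling)),
                iq        = as.numeric(scale(iq)))
fit_z <- lm(growth ~ schooling + iq, data = z)

print(round(std_beta, 3))
print(round(coef(fit_z)[-1], 3))
```

Both routes give the same numbers. Without the standard deviations of the variables, which a reader of the paper may not have, the conversion cannot be done from the unstandardized table alone.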

“Also, earlier studies that took account of earnings and cognitive test scores of migrants in the host country or IQs in wealthy oil countries have concluded that there is a substantial causal effect of IQ on earnings and productivity (Christainsen, 2013; Jones & Schneider,



National IQs were also found to predict migrant income, as well as most other socioeconomic traits, in Denmark and Norway (and Finland and the Netherlands).

Kirkegaard, E. O. W. (2014). Crime, income, educational attainment and employment among immigrant groups in Norway and Finland. Open Differential Psychology.

Kirkegaard, E. O. W., & Fuerst, J. (2014). Educational attainment, income, use of social benefits, crime rate and the general socioeconomic factor among 71 immigrant groups in Denmark. Open Differential Psychology.



Figures 3 A-C are of too low quality.



“Allocation of capital resources has been an element of classical growth theory (Solow, 1956). Human capital theory emphasizes that individuals with higher intelligence tend to have lower impulsivity and lower time preference (Shamosh & Gray, 2008). This is predicted to lead to higher savings rates and greater resource allocation to investment relative to consumption in countries with higher average



Time preference data for 45 countries are given by:

Wang, M., Rieger, M. O., & Hens, T. (2011). How time preferences differ: evidence from 45 countries.

They are in the megadataset from version 1.7f

Correlations among some variables of interest:

             SlowTimePref Income (DK) Income (NO)   IQ lgGDP
SlowTimePref         1.00        0.45        0.48 0.57  0.64
Income (DK)          0.45        1.00        0.89 0.55  0.59
Income (NO)          0.48        0.89        1.00 0.65  0.66
IQ                   0.57        0.55        0.65 1.00  0.72
lgGDP                0.64        0.59        0.66 0.72  1.00

             SlowTimePref Income (DK) Income (NO)   IQ lgGDP
SlowTimePref          273          32          12   45    40
Income (DK)            32         273          20   68    58
Income (NO)            12          20         273   23    20
IQ                     45          68          23  273   169
lgGDP                  40          58          20  169   273

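So time prefs predict income in DK and NO only slightly worse than national IQs or lgGDP do (r = .45 and .48, vs. .55 and .65 for IQ and .59 and .66 for lgGDP), tho the samples are small (n = 32 and 12).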
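For reference, a correlation matrix and the matching matrix of pairwise sample sizes can be produced in base R roughly as follows. This is a sketch on a made-up data frame; the column names and values are hypothetical, and it is not necessarily how the tables above were generated.

```r
# Hypothetical toy data with scattered missing values
d <- data.frame(a = c(1, 2, 3, 4, NA, 6),
                b = c(2, 1, 4, NA, 5, 7),
                c = c(NA, 3, 2, 5, 4, 8))

# Each correlation uses all rows where both variables are present
r_mat <- cor(d, use = "pairwise.complete.obs")

# Pairwise n: 1/0 indicator of non-missing cells, then cross-product
present <- !is.na(as.matrix(d))
storage.mode(present) <- "numeric"
n_mat <- crossprod(present)

print(round(r_mat, 2))
print(n_mat)
```

Reporting the n matrix next to the correlations matters here because each cell can rest on a different, and sometimes very small, set of countries.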



“Another possible mediator of intelligence effects that is difficult to measure at the country level is the willingness and ability to cooperate. A review by Jones (2008) shows that cooperativeness, measured in the Prisoner’s dilemma game, is positively related to intelligence. This correlate of intelligence may explain some of the relationship of intelligence with governance. Other likely mediators of the intelligence effect include less red tape and restrictions on economic activities (“economic freedom”), higher savings and/or investment, and technology adoption in developing countries.”


There are data for IQ and trust too. Presumably trust is closely related to willingness to cooperate.

Carl, N. (2014). Does intelligence explain the association between generalized trust and economic development? Intelligence, 47, 83–92. doi:10.1016/j.intell.2014.08.008



“There is no psychometric evidence for rising intelligence before that time because IQ tests were introduced only during the first decade of the 20th century, but literacy rates were rising steadily after the end of the Middle Age in all European countries for which we have evidence (Mitch, 1992; Stone, 1969), and the number of books printed per capita kept rising (Baten & van Zanden, 2008).”


There are also age heaping scores, which are a crude measure of numeracy. AH scores for 1800 to 1970 are in the megadataset. They have been going up for centuries too, just like literacy scores. See:

A’Hearn, B., Baten, J., & Crayen, D. (2009). Quantifying quantitative literacy: Age heaping and the history of human capital. The Journal of Economic History, 69(03), 783–808.
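For those unfamiliar with age heaping: the usual measure is the Whipple index, which compares the number of reported ages ending in 0 or 5 (within ages 23 to 62) with the one-fifth share expected if nobody rounded their age. The ABCC index rescales it to a 0-100 "share reporting an exact age" scale. Below is a minimal sketch of both as I understand the standard definitions; the function names are my own, and whether the AH scores in the megadataset were computed exactly this way is an assumption.

```r
# Whipple index: 100 = no heaping, 500 = every reported age ends in 0 or 5
whipple <- function(ages) {
  a <- ages[!is.na(ages) & ages >= 23 & ages <= 62]
  100 * 5 * sum(a %% 5 == 0) / length(a)
}

# ABCC index: linear rescaling of Whipple to 0-100 (100 = no heaping)
abcc <- function(w) {
  ifelse(w >= 100, (1 - (w - 100) / 400) * 100, 100)
}

# Two made-up extreme populations
no_heaping <- 23:62                                  # one person at every age
all_heaped <- rep(c(25, 30, 40, 50, 60), times = 8)  # everyone rounds their age

print(c(whipple(no_heaping), abcc(whipple(no_heaping))))
print(c(whipple(all_heaped), abcc(whipple(all_heaped))))
```

The first population gives a Whipple index of 100 and an ABCC of 100, the second a Whipple index of 500 and an ABCC of 0.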



“Why did this spiral of economic and cognitive growth take off in Europe rather than somewhere else, and why did it not happen earlier, for example in classical Athens or the Roman Empire? One part of the answer is that this process can start only when technologies are already in place to translate rising economic output into rising intelligence. The minimal requirements are a writing system that is simple enough to be learned by everyone without undue effort, and a means to produce and disseminate written materials: paper, and the printing press. The first requirement had been present in Europe and the Middle East (but not China) since antiquity, and the second was in place in Europe from the 15th century. The Arabs had learned both paper-making and printing from the Chinese in the 13th century (Carter, 1955), but showed little interest in books. Their civilization was entering into terminal decline at about that time (Huff, 1993).”


Are there no FLynn effects in China? They still have a difficult writing system.


“Most important is that Flynn effect gains have been decelerating in recent years. Recent losses (anti-Flynn effects) were noted in Britain, Denmark, Norway and Finland. Results for the Scandinavian countries are based on comprehensive IQ testing of military conscripts aged 18-19. Evidence for losses among British teenagers is derived from the Raven test (Flynn, 2009) and Piagetian tests (Shayer & Ginsburg, 2009). These observations suggest that for cohorts born after about 1980, the Flynn effect is ending or has ended in many and perhaps most of the economically most advanced countries. Messages from the United States are mixed, with some studies reporting continuing gains (Flynn, 2012) and others no change (Beaujean & Osterlind,



These are confounded with immigration of low-g migrants however. Maybe the FLynn effect is still there, just being masked by dysgenics + low-g immigration.



“The unsustainability of this situation is obvious. Estimating that one third of the present IQ differences between countries can be attributed to genetics, and adding this to the consequences of dysgenic fertility within countries, leaves us with a genetic decline of between 1 and 2 IQ points per generation for the entire world population. This decline is still more than offset by Flynn effects in less developed countries, and the average IQ of the world’s population is still rising. This phase of history will end when today’s developing countries reach the end of the Flynn effect. “Peak IQ” can reasonably be expected in cohorts born around the mid-21st century. The assumptions of the peak IQ prediction are that (1) Flynn effects are limited by genetic endowments, (2) some countries are approaching their genetic limits already, and others will follow, and (3) today’s patterns of differential fertility favoring the less intelligent will persist into the foreseeable future.”


It is possible that embryo selection for higher g will kick in and change this.

Shulman, C., & Bostrom, N. (2014). Embryo Selection for Cognitive Enhancement: Curiosity or Game-changer? Global Policy, 5(1), 85–92. doi:10.1111/1758-5899.12123



“Fertility differentials between countries lead to replacement migration: the movement of people from high-fertility countries to low-fertility countries, with gradual replacement of the native populations in the low-fertility countries (Coleman, 2002). The economic consequences depend on the quality of the migrants and their descendants. Educational, cognitive and economic outcomes of migrants are influenced heavily by prevailing educational, cognitive and economic levels in the country of origin (Carabaña, 2011; Kirkegaard, 2013; Levels & Dronkers, 2008), and by the selectivity of migration. Brain drain from poor to prosperous countries is extensive already, for example among scientists (Franzoni, Scellato & Stephan, 2012; Hunter, Oswald & Charlton, 2009).”


There are quite a few more papers on the spatial transferability hypothesis. I have 5 papers on this alone in ODP:

But there are also as-yet unpublished data on crime in the Netherlands and more crime data for Norway. Papers based on these data are on their way.