I recently read the biography of William Shockley. Basically, I am reading biographies of prominent researchers to gain an understanding of them. I was also thinking of writing one about Arthur Jensen, having already set up a website for him and read most of his writings and writings about him.


The book is written by a journalist with a poor understanding of differential psychology, so he is critical of IQ testing at various places in the book. Just ignore that. It is however curious that Shockley took a number of IQ tests and never scored particularly highly: his scores were in the 120-130 region, yet people described him as very bright. My guess is that his ability profile was very tilted. A high level of general cognitive ability + strong tilt + test ceilings will mean that the test has a downward bias. This is because your strong abilities will reach the ceiling and thus be underestimated, while the weaker abilities will not and thus be accurately measured. The average of no bias and downward bias is some downward bias.
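The ceiling argument can be made concrete with a small simulation. This is only a sketch under invented assumptions: two subtests, scores in SD units above the population mean, and made-up values for the ability level, tilt, ceiling and noise. None of these numbers come from Shockley's actual test results.

```python
import random
import statistics


def simulate_ceiling_bias(g=2.0, tilt=1.0, ceiling=2.5, n_people=10000,
                          noise_sd=0.3, seed=0):
    """Return (mean true composite, mean measured composite) in SD units.

    Hypothetical setup: each person has a strong ability at g + tilt and a
    weak ability at g - tilt, plus random noise. The test cannot report any
    subtest score above `ceiling`.
    """
    rng = random.Random(seed)
    true_scores = []
    measured_scores = []
    for _ in range(n_people):
        strong = g + tilt + rng.gauss(0, noise_sd)
        weak = g - tilt + rng.gauss(0, noise_sd)
        # True composite: plain average of the two abilities.
        true_scores.append((strong + weak) / 2)
        # Measured composite: each subtest is capped at the ceiling first.
        measured_scores.append((min(strong, ceiling) + min(weak, ceiling)) / 2)
    return statistics.mean(true_scores), statistics.mean(measured_scores)


if __name__ == "__main__":
    for tilt in (0.0, 1.0):
        true_mean, measured_mean = simulate_ceiling_bias(tilt=tilt)
        print(f"tilt={tilt}: true={true_mean:.2f}, measured={measured_mean:.2f}")
```

With the same general ability level, the tilted profile loses much more to the ceiling than the flat profile, because only the strong ability gets capped while the weak one is measured accurately.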

Shockley sounds like a person who would have a strong tilt. My reasoning is based on the stereotype that people in tech have weaker social abilities and are tilted away from verbal abilities. I don’t have any good evidence for the first, but see this post for the second. A second part of my reasoning is that Shockley was extremely insensitive. My hypothesis is that interpersonal sensitivity is correlated with a verbal tilt. This holds between the genders, for instance (women are higher in both); my hypothesis is simply that it is a more general pattern that holds within genders too. I am not familiar with any strong evidence for this.

Aside from the various YouTube videos one can find of him debating IQ and race on TV in the 60s and 70s, here’s a quote illustrating his insensitivity:

The late William Shockley once spoke of Nature as having “color-coded groups of individuals so that statistically reliable predictions of their adaptability to intellectually rewarding and effective lives can easily be made and profitably be used by the pragmatic man in the street” (Shockley, 1972, p. 307). This was an unfortunate choice of metaphor, from at least two standpoints. First, it is misleading in a very fundamental way. The point of color coding electrical or electronic components is to let the user know what the internal characteristics of the device are from the external color code. The scheme works because the user can trust manufacturers to supply components bearing a given color code that are uniform in the coded-for property, and distinct from components of other colors. If electrical and electronics manufacturers did as badly as Nature apparently has–so that components bearing any color code varied widely among themselves, and overlapped extensively with components of other color codes–users would abandon the color coding as worthless, and resort to direct tests on the components themselves to find one that in fact possesses the properties desired.

A second objection to the metaphor is that it is value-laden, and the values are not very sympathetic ones. Why should ordinary people, of any color, be equated to simple electrical or electronic components whose only role is as interchangeable parts in more complex systems, and why should Nature be arranging things for the benefit of personnel managers anyway? I venture to say that if Professor Shockley–or you or I–had written down the quoted sentence and then stopped to think whether it might give a misleading impression or unintended offense, you or I–and perhaps even Professor Shockley–would have wound up saying it differently.

The quote is from the little known editorial by Loehlin: Should we do research on race differences in intelligence

I propose the same explanation for another socially odd but smart person who did not score particularly high on tests: Richard Feynman.

I have written to Thomas Coyle to hear if he knows about any studies about these proposed relationships.

Wikipedian battlegrounds

For those who don’t know, Wikipedia is a common battleground for the ideological part of the race and intelligence debate. One can see this in the talk pages of these articles. See also the discussion of the phenomenon here.

The most active pushers of environment-only right now are WeijiBaikeBianji and maunus. WBB got a temp ban for it recently, but it has been lifted, so he is back in business. Probably the easiest indication of his extreme bias is his own compiled list of good sources. A search reveals that there is not a single citation of Jensen, the most prominent researcher in this area. There are also 0 for Richard Lynn, and Phil Rushton, i.e. the three grand old men of the hereditarian side. On the other hand, one will find 9 references to Flynn, 17 to Sternberg and 2 to Nisbett.

Not surprisingly, due to the tedium of edit warring and because the Wikipedia source code is free, others have set up Wikipedias more suitable to their ideologies. Perhaps the most comical is the ultra-conservative creationist Wikipedia Conservapedia. Of more interest is RationalWiki, a Wiki centered on rationality and pseudoscience. The general content of the Wiki is good, e.g. on NLP or dowsing. However, it is awful on matters that the American left-wing does not like, including race and intelligence. Unsurprisingly, the combination is even worse, for instance in the article on The Bell Curve:

The Bell Curve is a highly controversial 1994 book by Richard Herrnstein and Charles Murray. It purports to show that intelligence is the most dominant factor in the trajectory of each person’s life, and it serves to predict such things as socioeconomic status and tendencies towards criminal behavior. The book has served as a platform for many modern-day racists, giving them an “intellectual” basis and source of data to support many of their beliefs. Quite a bit of the research compiled within The Bell Curve is not disputed, but the conclusions drawn from it are considered by many to be bunk, and it has been criticized for aiding racist ideologies.

One of the admins of the website, Krom, is currently engaging John Fuerst in discussion at the OpenPsych forum. The thread is up to >34 pages by now.

All in all, RationalWiki is an okay source but mind the topic. Alternatives include TalkOrigins (for creationism), SkepticalScience (for climate denialism), as well as the Skeptic’s dictionary which covers many areas.


Some time ago I noticed that someone had set up another Wiki that does not fret about the race and intelligence stuff. I decided to give the article a read since I’m an expert in this area. To my surprise, it is fairly updated. While not quite as good a review of the facts as John Fuerst’s old 2012 review, it is more up to date, even covering findings published at HumanVarieties.

When I read the Race and Intelligence page I had a lot of comments, but just kept them in my head. However, another idea is for me to conduct a review here. I am too busy to spend time re-writing it. The page seems to be mostly written by an anonymous Swede.

Review of Metapedia’s Race and Intelligence

Quotes are from the article unless otherwise stated.

Race differences in intelligence was historically a common view. For example, Muslim writers stated low intelligence among Blacks.[1][2]

Early scientific research started in the nineteenth century and included methods such as skull and brain measurements.[3]

The first IQ test was created in 1905. By the end of the twentieth century many hundreds of studies on racial IQ differences had been published.[3]

Race research in general, including also race and intelligence research, become increasingly a taboo subject after WWII. During this time the Pioneer Fund was influential in keeping some research and debate alive.

Galton is not mentioned but should be. He made the first quantitative estimates of racial differences in intelligence. They were surprisingly precise despite Galton not having any actual mental tests. Galton traveled widely in Africa and formed his judgment on that basis. Jensen wrote a lengthy review of Galton’s lasting influence.

In 1969 Arthur Jensen caused great public controversy with the article “How Much Can We Boost IQ and School Achievement?” in which he argued for genetics being an important explanation for the measured differences.[4]

Here they should cite the original article. Because the article was so influential, they should do a direct quote of his words. Jensen did not state his case quite as strongly as the writing makes it seem.

Richard Lynn in his 2006 book Race Differences in Intelligence reviewed the literature on worldwide IQ testing and calculated the average IQs for different races based on earlier IQ test results (citing hundreds of different studies testing the average IQ of different races).


This should be in a table.

They should mention the debate about the Sub-Saharan African IQs with Wicherts, Lynn, Meisenberg, and Rindermann. See citations in Rindermann, H. (2013). African cognitive ability: Research, results, divergences and recommendations. Personality and Individual Differences, 55(3), 229-233.

The average IQ of the world as of 2000 has been estimated to be 90 based on estimated average country IQs and country population sizes.[8]

A histogram of the IQs seems in order here.
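The quoted world average is a population-weighted mean of country scores. A minimal sketch of that calculation follows; the three scores and population sizes are invented for illustration and are not taken from the cited source.

```python
def population_weighted_mean(values, populations):
    """Average `values`, weighting each one by the matching population size."""
    if len(values) != len(populations):
        raise ValueError("values and populations must have the same length")
    total = sum(populations)
    if total <= 0:
        raise ValueError("total population must be positive")
    return sum(v * p for v, p in zip(values, populations)) / total


if __name__ == "__main__":
    # Hypothetical country means and populations (arbitrary units).
    scores = [100, 85, 70]
    populations = [1, 1, 2]
    print(population_weighted_mean(scores, populations))  # 81.25
    print(sum(scores) / len(scores))                      # 85.0 (unweighted)
```

The weighted and unweighted means differ whenever the larger populations sit at one end of the distribution, which is why the weighting matters for a world estimate.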

The US group “Hispanics” is a diverse group that may have European, Amerindian, and Sub-Saharan African origins in varying proportions. Most are of mixed Amerindian/European origin.

In the US the tested average IQ of “Hispanics” is typically intermediate between that of Blacks and Whites.[12] Both the above mentioned 2001 meta-analysis and the book Race Differences in Intelligence found an average IQ of 89.[3]

Probably mention the results from Fuerst’s 2014 review: Fuerst, J. (2014). Ethnic/Race Differences in Aptitude by Generation in the United States: An Exploratory Meta-analysis. Open Differential Psychology.

Several studies of the IQ of Gypsies, a people of South Asian origin living in Europe since several centuries but with little intermarrying with other groups, have found average IQs ranging from 70 to 83.[16]

See also Gypsies: Intelligence.

They should cite the results from the meta-analysis presented at the London Conference of Intelligence. The presentation is here.

The 2015 book The Nature of Race stated that “the cognitive ability scores of international migrants tend to correlate with the cognitive ability scores of those from the regions of origin. That is, to some extent, contemporaneous migrants carry their region of origin abilities with them and the differences brought persist at least until the second or third generation (Carabaña, 2011; De Philippis, 2013; Fuerst, 2014; Kirkegaard, 2015).”[18]

They should not give long quotes like this inline, and especially not without giving the references for the cited literature. Readers can’t know what (Kirkegaard, 2015) refers to without looking up that study. I don’t even know which study it is.

The book The Nature of Race stated that “As Baten and Juif (2013) note, the international cognitive ability differences are not new and they precede the event of mass schooling. As such, Baten and Sohn (2013) found that Korea, China and Japan had high numeracy levels in the 1600s; Juif and Baten (2013) found that Spanish and Portuguese had higher numeracy levels than Amerindian Incans in the 1500s. Juif and Baten (2013) also found that 1820 cohort ethnic/national cognitive ability levels predicted 21st century national levels.”[18]

They should show a scatter plot of the Age Heaping x IQ scores. See my earlier post.

There have been some estimates of the IQ:s of monkeys, apes, Homo habilis, and Homo erectus based on how far they progress or have been estimated to progress on Piaget’s cognitive stages of development. Monkeys have been estimated to have an IQ of about 12, apes about 22, Homo habilis about the same as apes, and Homo erectus about 50.[3] Some apes have in captivity been taught limited language skills. However, sceptics have doubted that the claimed language abilities are real.[22][23]

Lynn’s speculations probably do not warrant inclusion. The last two references should not be cited.

Those arguing for a genetic explanation, sometimes referred to as “hereditarians”, are frequently subjected to various forms of ad hominem personal attacks. This may include accusations of being “racist” (in some extremely negative sense), associations with claimed “racists”, claimed “racists” using the research etc. Obviously ad hominem personal attacks are not scientifically valid arguments regarding what causes the racial IQ differences.

These claims probably warrant some examples. RationalWiki is full of them. One can also cite stuff from Nisbett and Sternberg’s writings.

Denying the existence of races may be used as an attempted argument against race and IQ research. However, it should be noted that even if races in the sense of subspecies were proven to be incorrect, then this does not actually make the genetic debate disappear. Blacks and Whites would still differ genetically regarding, for example, the genes for skin color and the genetically determined prevalence of sickle-cell anemia. So they could differ also regarding IQ genes.

Furthermore, IQ is likely affected by a very large number of genes. This means that even if the population differences regarding the population frequencies of individual gene variants affecting IQ are all small, but these population frequencies correlate, then the total effect of many such small but correlated differences may be that the population differences regarding genetic effects on IQ are very large.

It is perfectly possible to study the role of genetics as an explanation for differences between groups that are not subspecies. Current examples would include the enormous amount of medical research regarding the genetic differences between those having a certain disease and those not having this disease.

Zero references are given for this section. Including a few as well as perhaps a figure would be in order. Something like this perhaps blogs.discovermagazine.com/gnxp/2013/05/why-race-as-a-biological-construct-matters/

IQ values such as the 54 for Bushmen have been by some as implausible low since it would a give a diagnosis of mental retardation in European countries. However, Europeans with more severe forms of mental retardation often have genetic diseases that cause many other problems beside the low IQ. A better comparison is with European children with comparable IQ. An IQ of 54 is equivalent to the average IQ of European 8 years old children. These can learn to read, write, and do arithmetic. Historically the great majority European children worked productively at this age. This is also the case today for many children of this age in developing nations.[3]

It is worth mentioning that there are very few studies of these groups. Lynn is not known for being diligent with his exact numbers, see Malloy’s review, e.g. for Thailand. I seem to recall the Bushmen value was incorrect.

Controlling for different average socioeconomic status (SES) of Blacks and Whites only reduces the Black-White IQ gap by a third or 5 points. Furthermore, if the Black-White IQ gap is in part caused by genetics, then this number is overstated since the Black-White SES gap is partially caused by the Black-White IQ gap.[4] Not considering such effects is one example of The sociologist’s fallacy.

Here it is also worth citing Sesardic’s excellent book.

A 2014 genetic study, although not studying racial IQ differences, found that “using a new technique applied to DNA from 3000 unrelated children, we show significant genetic influence on family SES, and on its association with children’s IQ…our results emphasize the need to consider genetics in research and policy on family SES and its association with children’s IQ.”[30]

The citation of only this study makes it seem like this is a new idea or new finding. In fact Rowe et al. reported a finding like this in the late 1990s. The idea is fairly obvious and was first promoted widely by Herrnstein in the 1970s. There are also multiple newer studies using GCTA that find narrow-sense, common-variant h² values around .20.

Rowe, D. C., Vesterdal, W. J., & Rodgers, J. L. (1998). Herrnstein’s syllogism: Genetic and shared environmental influences on IQ, education, and income. Intelligence, 26(4), 405-423.

Marioni, R. E., Davies, G., Hayward, C., Liewald, D., Kerr, S. M., Campbell, A., … & Deary, I. J. (2014). Molecular genetic contributions to socioeconomic status and intelligence. Intelligence, 44, 26-32.

An environment only explanation for the Black-White IQ gap predicts that the IQ gap should be smaller at higher levels of parental SES since these children should be less exposed to the environmental factors lowering IQ. However, the gap is actually equal or larger at higher parental SES levels.[4] In contrast, hereditarians can explain this by regression to different racial genetic averages (see the section “Regression to the mean” below).[31] Another explanation is Black parents having lower average genetic IQ than White parents despite having similar SES. This may be due to factors such as affirmative action causing discrimination against Whites in education/employment.[32]

That does not have to be the case. But yes, a simple environment only model based only on AAs receiving more negative environment factors that are only present in substantial amounts in the lower S levels of societies would imply this. Such a model is falsified.

One early view was that the US Black-White IQ gap was caused by the segregated schools. However, the 1954 Supreme Court decision against segregated schooling and the consequent nationwide program of school busing did not cause the gap to disappear. Furthermore, the Coleman Report found little support for the schools being an important explanation for the Black-White IQ gap or IQ results in general. Negligible, and in some cases, negative correlations were found between IQ and variables such as pupil expenditure, teachers’ salaries, teachers’ qualifications, student/teacher ratios, and the availability of other school professionals. Also, IQ group differences are found also in European countries with desegregated schools. Hereditarians have furthermore argued that the Black-White IQ gap is equally large for 3-year-old children who have not yet started school.[4][3]

The Coleman Report is mentioned but not cited properly. No references are given for the European results or their exact nature.

Similar diagram based on the 1994 book The Bell Curve.

No actual source given for this figure.

Certain factors that are common in developing countries like iodine deficiency during pregnancy/childhood and certain tropical diseases like malaria are known to affect IQ negatively. However, these factors are very rare in developed countries and thus cannot explain for example the US Black-White IQ gap.

That malnutrition would be more common among Blacks than Whites in the US is argued to be excluded the absence of height differences and nutritional studies.[3]

They keep citing Lynn 2006 for all kinds of claims. This is not proper. They should cite primary literature showing that these diseases are not important factors. I don’t know of such studies, maybe they exist. A recent Faroe Islands study found that prenatal mercury poisoning may have a small effect on later IQ (2.2 IQ points per 10-fold increase), even controlling for maternal IQ.

Debes, F., Weihe, P., & Grandjean, P. (2015). Cognitive deficits at age 22 years associated with prenatal exposure to methylmercury. Cortex.
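An effect stated "per 10-fold increase" is easiest to read as a log-linear dose-response. The sketch below assumes exactly that functional form (deficit proportional to the base-10 logarithm of the exposure ratio) and uses the 2.2 figure quoted above. The function name and the log-linear assumption are mine, not taken from the paper.

```python
import math


def predicted_iq_deficit(exposure_ratio, points_per_tenfold=2.2):
    """Predicted IQ deficit for a given ratio of exposures.

    Assumes a log-linear dose-response: every 10-fold increase in exposure
    costs `points_per_tenfold` IQ points.
    """
    if exposure_ratio <= 0:
        raise ValueError("exposure_ratio must be positive")
    return points_per_tenfold * math.log10(exposure_ratio)


if __name__ == "__main__":
    for ratio in (2, 10, 100):
        print(ratio, round(predicted_iq_deficit(ratio), 2))
```

Under this assumption a doubling of exposure corresponds to roughly 0.66 points, and even a 100-fold increase only to 4.4 points, which is what "a small effect" means here.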

A 2013 study examined to what degree the average country IQ differences are caused by poor living conditions at or near the test-takers’ time of birth and stated that “The paper finds that the impact of living conditions is of much smaller magnitude than is suggested by just looking at correlations between average IQ scores and socioeconomic indicators…As far as IQ and the wealth of nations are concerned, causality thus appears to run mostly from the former to the latter. The test-takers’ region of ancestry dominates the regression results. While differences in average scores worldwide can thus be plausibly viewed as being influenced by genetic differences across world regions, it is also possible that score differences are influenced by regional differences in culture that are independent of genetic factors. Differences in average IQ across world regions may change in the years ahead insofar as the strength of Flynn effects may not be uniform, but some regional differences in average g levels seem likely to continue indefinitely.”[33]

Another long quote that is given in-text instead of as a block. This suboptimal presentation is very common. I shall not mention it again.

There may be problems with testing US immigrants or other persons who are not native English speakers if using English verbal IQ tests. On the other hand, such groups may be tested with non-verbal tests such as Raven’s Progressive Matrices. Hereditarians have argued that studies have shown that IQ test scores predict school grades and job performance equally well for Africans as they do for non-Africans[4][10]

The scores from some countries may be uncertain due to factors such as only small studies being available. On the other hand there are also very large scale international student assessment tests that avoid many of these problems. See the article Countries and intelligence.

They fail to cite the most important book in this area, Jensen’s 1980 book.

They cite my study about item-level Raven’s. They could cite that for lack of bias here as well. The item-difficulty scores had very high cross-sample correlations: .88. emilkirkegaard.dk/en/?p=4971
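A cross-sample correlation of item difficulties is computed by taking one difficulty value per item in each sample (for instance the pass rate) and correlating the two lists. A minimal sketch, using invented pass rates that are not the data from the study:

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical per-item pass rates in two samples (same items, same order).
    sample_a = [0.95, 0.90, 0.80, 0.65, 0.50, 0.30, 0.15]
    sample_b = [0.90, 0.85, 0.70, 0.60, 0.40, 0.25, 0.10]
    print(round(pearson_r(sample_a, sample_b), 3))
```

A high correlation means the items are ordered by difficulty in nearly the same way in both samples, which is the pattern one expects if the test measures the same thing in each group.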

Stereotype threat is an argued fear that a person’s behavior will confirm an existing stereotype of a group to which the person belongs. This may in turn lead to an impairment of the person’s performance. This has been seen as one explanation for the racial gaps.

Early laboratory experiments finding a stereotype threat effect has been greatly misreported, in both popular and academic literature, as showing that stereotype threat explains the whole gap.[34] An unpublished “meta-analysis of 55 published and unpublished studies of this effect shows clear signs of publication bias. The effect varies widely across studies, and is generally small. Although elite university undergraduates may underperform on cognitive tests due to stereotype threat, this effect does not generalize to non-adapted standardized tests, high-stakes settings, and less academically gifted test-takers. Stereotype threat cannot explain the difference in mean cognitive test performance between African Americans and European Americans.”[35][36]

They should cite prominent authors invoking this explanation. Surely Flynn, Sternberg and Nisbett will provide the necessary text material.

They cite Wicherts’s conference presentation, but not the actual published meta-analysis, which does not make sense. They should also cite work by Lee Jussim on this topic. Steve Sailer is an inappropriate source.

Flore, P. C., & Wicherts, J. M. (2015). Does stereotype threat influence performance of girls in stereotyped domains? A meta-analysis. Journal of school psychology, 53(1), 25-44.

Jussim, L. (2012). Social perception and social reality: Why accuracy dominates bias and self-fulfilling prophecy. Oxford University Press.

In the middle of the twentieth century a large number of early childhood intervention programs, such as the Head Start program, were tried with one expectation being that these would eliminate or substantially reduce various IQ gaps including the racial IQ gaps. Large initial IQ gains were also found but the initial enthusiasm declined as it become apparent that the IQ or achievement tests gains soon faded away as the children grew older. For example, a 1995 review of 36 such early intervention programs found no consistent pattern of lasting effects on IQ or achievement tests.[37]

A few of the programs have found longer lasting effects contrary to this general pattern. The most well-known may be the Abecedarian Early Intervention Project which found limited IQ gains lasting to adulthood. However, there have been various criticisms. One is that there is evidence for the intervention and control groups being dissimilar due to pure chance (e.g., sampling error) or non-random attrition of participants. The other few claimed exceptions have been criticized due to poor methodology, “teaching the test”, and even a conviction of misuse of federal funds. A 2014 article stated lack of good evidence for anything except a null (or small) long-term effect from intervention programs (as well as adoptions) on the g factor.[37][38][39]

Here they fail to cite the meta-analysis by te Nijenhuis and me on whether Head Start gains are on the g factor.

Furthermore, a large meta-analysis of these intervention studies showed that the favorite studies cited by sociologists are statistical flukes, i.e. it is citation bias. The authors who collected the studies did however not do the publication bias analysis, or they didn’t publish it. But I did it and published it just on Twitter. It is worth publishing of course, but I have been too busy. Someone else can write it up and add me as senior author.
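The post does not say which publication bias analysis was run, so the following is only one common option, not a reconstruction of it: Egger's regression. Each study's standardized effect (effect divided by its standard error) is regressed on its precision (1 divided by the standard error); an intercept far from zero suggests that small studies report systematically different effects. The function name and the toy inputs are invented for illustration.

```python
def egger_regression(effects, standard_errors):
    """Return (intercept, slope) of Egger's regression.

    Regresses effect / SE on 1 / SE by ordinary least squares.
    The slope estimates the underlying effect; the intercept captures
    small-study asymmetry.
    """
    if len(effects) != len(standard_errors) or len(effects) < 3:
        raise ValueError("need at least 3 studies with matching SEs")
    z = [e / s for e, s in zip(effects, standard_errors)]
    precision = [1 / s for s in standard_errors]
    n = len(z)
    mean_x = sum(precision) / n
    mean_y = sum(z) / n
    sxx = sum((x - mean_x) ** 2 for x in precision)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(precision, z))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


if __name__ == "__main__":
    ses = [0.05, 0.10, 0.20, 0.30, 0.40]
    # Toy data where smaller studies (larger SE) report larger effects.
    biased_effects = [0.2 + 1.5 * s for s in ses]
    print(egger_regression(biased_effects, ses))
```

This sketch returns point estimates only. A real analysis would also need a standard error and significance test for the intercept, and would usually be paired with a funnel plot.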

te Nijenhuis, J., Jongeneel-Grimen, B., & Kirkegaard, E. O. (2014). Are Headstart gains on the g factor? A meta-analysis. Intelligence, 46, 209-215.


Authors don’t list the data in the study, but they were so nice as to share it with me. Remember to cite their study if you use it!


The Minnesota Trans-Racial Adoption Study studied 265 children adopted by White upper-middle-class parents with an average IQ of 120. Despite this similar environment, consistent racial differences were found on IQ, school grades, class ranks, and aptitude tests. At age 17 Whites scored 106, mixed race 99, and Blacks 89. 89 was also the average score for Blacks in general in Minnesota. The same difference between mixed race and Black children occurred also in in some cases in which the adopting parents wrongly thought that mixed race children had two Black parents. These results caused considerable debate. Non-hereditarians have raised objections such as the Black and mixed race children having psychological problems due to identity issues, possibly being placed in relatively poorer homes, and being adopted later and having more prior foster home placements both of which is associated with lower IQ. Hereditarians have argued that none of these are convincing explanations.[10][4]

Should probably also cite discussion in Jensen 1998 as well as Loehlin’s 2000 book chapter.



Loehlin, J. C. (2000). Group differences in intelligence. Handbook of intelligence, 176-193.

I also think Flynn has discussed it but not sure which books.

There are also three studies of adopted East Asian Children who in some cases were malnourished and adopted late. Despite this, and presumably also identity issues, they scored highly on IQ tests.[10]

Cite the primary study. It’s a very small study by Lynn, n=22 or something.

A study comparing 83 German White children with 98 mixed race children born to post-WWII German mothers and “Black” soldiers found only very small IQ differences. As for adoption studies on young children, this study has been criticized for not having any follow-up when the children were older. 20-25% of the “Black” soldiers were from French North Africa. The soldiers from the US almost certainly had higher IQ than the average Black due to Army General Classification Test excluding 30% of Blacks.[10] Furthermore, the results for the White children differed greatly for the boys and the girls. The expectation would be similar results and the large difference may be an indication of methodological/sampling problems with the study.

There is also a study finding a significant IQ difference between mixed race children born to White mothers versus Black mothers. This is argued to support an environmental explanation. Again it has been criticized for only testing the children when young. The White mothers had longer education and thus likely a higher IQ. The two groups also scored intermediately between the average IQs of the Black and White children in the study.[10]

Cite primary literature, not reviews by R&J for the 100th time.

US Blacks have on average a low degree of European ancestry. If the partially genetic explanation is correct, then one would expect that those with more European genetics would have higher IQ and brain weight. Studies using skin color as an indirect measure of the degree of European genes of Blacks have found weak such correlations. The weak correlation is argued to be explained by skin color in African Americans being only a weak indicator of the degree of African Ancestry. The results are argued to support that 50-75% of the IQ gap is explained by genetic factors. Non-hereditarians have argued that not all studies have found this correlation but non-hereditarians have argued that this was due to these studies being too small. Another argument is that the correlation may be due to societal advantages causing higher IQ for Blacks with lighter skin color. Hereditarians have argued that this is unlikely, East Asians have been discriminated against but still do not score low on IQ tests, and a study regarding if Blacks with darker skin color were more discriminated found no or contradictory results.[43][44][4]

The best resource here is the meta-analysis Fuerst did; see the presentation mentioned earlier. There is a small but positive relationship between lighter skin and outcomes, as expected under a genetic hypothesis.

There are also several other kinds of empirical evidence against “colorism” (skin color differences as the cause of group differences due to factors such as racism). For example, darker skin color not being associated with more negative outcomes after controlling for IQ differences.[45]

Here the article cites an entire category of posts on HV. That’s not right. Cite specific posts. Better, write a review article.

Another method is by examining the relationship between the degree of European blood groups and IQ. Two studies from the 1970s argued that there were no such relationship. These have been criticized for using genetic makers that would have been unable to detect a relationship.[10][46]

Here it is best to cite Loehlin et al.’s early review book of the evidence. They discuss it further. The book appears to be mostly forgotten by now. Google Scholar doesn’t have a proper citation for it, and neither does Amazon.

Loehlin, J.; Lindzey, G., and Spuhler, J.. (1975). Race Differences in Intelligence.

A study from the 1930s using self-reported degree of European ancestry found a small negative correlation with IQ. However, self-report has been criticized as being very uncertain and in particular regarding possible race mixing during the slavery period.[10]

Cite primary study. Also, cite Nisbett’s discussion of it for environmentalist view.

Nisbett, R. E. (2009). Intelligence and how to get it: Why schools and cultures count. WW Norton & Company.

Hereditarians have pointed out more recent studies in the US, Brazil, and South Africa finding that the average IQ of the populations of mixed Black and White origin is intermediate between that of Blacks and Whites. In the case of the US study explanations based on social class and “discrimination based on skin tone” were argued to be ruled out or controlled for. In certain US areas where Blacks have a low degree European ancestry their average IQ is unusually low.[10][4][3]

Again, primary sources needed.

Also, an actual study of AAs European% and NAEP scores did show expected relationships: African% -.31, Euro% .28, Amer .14. On the other hand, for Hispanics, one of the relationships was in the wrong direction: African .27, Amer -.24, Euro .11. Because there are generational changes with Hispanic scores, the AA ones weigh stronger. emilkirkegaard.dk/en/?p=4648

A 2014 article found 31 genetic admixture studies which reported, for individuals residing in the Americas, associations between continental ancestry (e.g., European, Amerindian, Sub-Saharan African, East Asian, and Pacific Islander) and some index of educational attainment or socioeconomic status. None of the associations went in a direction opposite to that predicted by the average IQ scores of the ancestral populations. The results were argued to “support a racial hereditarian hypothesis along with others that predict a fairly internationally consistent association between continental ancestry and cognitively correlated indices of socioeconomic status such as education, income, and job prestige”.[47]

They cite Fuerst’s early post on the admixture project for individual level data. It is better to cite the presentation since it contains more studies (48). It’s also not an article, it’s a blogpost.

Regarding immigrant cognitive ability it has been stated that “The matter, of course, is complicated by migrant selectivity, ethnic identification attrition, differential breeding patterns, non-trivial environmental influence on measures, and so on. Yet, were a racial hereditarian position correct, one would expect to find, when looking across numerous countries, a robust statistical association between region of origin scores and migrant scores.” Studies on the average cognitive ability of various immigrant groups are argued to find similarities with region of origins for at least several generations after the immigration which have been argued to support a partially genetic explanation.[18]

West Indians of African origin that emigrate to the US are more successful than US Blacks. This has been seen as possible evidence for some environmental factor affecting US Blacks negatively. Another explanation is that these emigrants are not random sample of West Indians of African origin but a selected group with unusual characteristics. One 2008 study wrote that “West Indian success can be attributed entirely to the greater talent and ambition of those who choose to move. Similarly, the subset of African Americans who are voluntary internal migrants are better off than their less venturesome counterparts. Once this point is clear, it is easy to see why West Indian success offers no lessons for African American improvement.”[49]

Instead of just citing John’s summary of the spatial transferability hypothesis, it is better to cite all the studies looking at immigrants by country of origin. They invariably show the expected pattern for socioeconomic outcomes. Cognitive data is missing, so it cannot be used. However, I recently obtained permission to use the Danish draft test data, so we will see what I find.

Another factor which could possible cause a change in average cognitive ability score for a group (immigrant-derived or not) is if an increasing number of mixed-race individuals tend to identify themselves as members of this group (such as mixed Black-White individuals tending to identify themselves as Black). However, such a cognitive ability score change could reflect the average genetics of the group changing.

This is possible to investigate if one can get a sample with both generation and admixture. For instance for US Hispanics. Such a dataset may exist.

The g factor (general factor) is often seen as the underlying general mental ability that is measured more or less well by different cognitive tests. A test’s g loading, or a subtest’s g loading, refers to how well it correlates with the g factor. The g factor has been argued, based on evidence such as twin studies, to be largely genetically determined. Black-White IQ differences are largest on those tests and subtests having the highest g loadings. This has also been seen as evidence for a genetic explanation.[10] See also the sections “Spearman’s hypothesis” and “Significance of IQ and g“.

They need to cite a review of the heritability of cognitive ability. Also, I would avoid the phrasing “genetically determined” to avoid fueling the fire of people who hate genetic determinism, a position pretty much no researcher believes in.

The Flynn effect refers to the observed worldwide slow increase in average IQ test scores. A variety of explanations have been proposed such as increased familiarity with taking tests (thus not reflecting genuine intelligence changes) and explanations such as improved nutrition during childhood which affects the developing brain (thus reflecting genuine intelligence changes). This has been seen as evidence for that IQ can be changed significantly by environmental factors and that the racial IQ gaps may eventually disappear. However, at least in the US and some other developed nations the gains from the Flynn effect correlate negatively with g loadings and inbreeding depression. This argued to show that that at least in those countries the environmentally caused Flynn effect will not significantly narrow the largely genetically determined Black-White IQ gap.[15] A 2013 meta-analysis concluded that “It appears that the Flynn effect and group differences have different causes.”[50]

There is a newer meta-analysis with the Jensen method and the Flynn effect.

Woodley, M. A., te Nijenhuis, J., Must, O., & Must, A. (2014). Controlling for increased guessing enhances the independence of the Flynn effect from g: The return of the Brand effect. Intelligence, 43, 27-34.

Regression to the mean refer to the tendency for an exceptional result, such as getting all 6s or all 1s when rolling several dice, to be followed by less exceptional results (regression) that is closer to the average (mean) result. Hereditarians argue that the relatives of Blacks and White with exceptional IQs, low or high, will show predictable differences in regression due to the Black and White populations having different genetically determined average IQs. These predictions are argued to be confirmed. The same effect may possibly happen also due to environmental factors that behave similarly to IQ genes but hereditarians argue that this is unlikely and no such factors have been presented.[10][4]

I don’t know why regression phenomena would be evidence in favor of genetic models. See also Brody’s discussion:

Brody, N. (2003). Jensen’s genetic interpretation of racial differences in intelligence: Critical evaluation. The scientific study of general intelligence: Tribute to Arthur R. Jensen, 397-410.

The reaction time in response to a stimuli can be measured and tested on a variety of different tasks. Differences in reaction time are argued to be due to neurophysiological differences in the brain’s ability to process information which is also what IQ tests measure. Due to the unusual nature of the testing it is unlikely to be influenced by practice or education. Non-hereditarians have dismissed it as having a low and uncertain correlation with IQ. The related measure decision time has been similarly dismissed. Hereditarians have argued that this only applies when only one task is used. When the results from different tasks are combined, as is also done in IQ tests, the correlation between IQ and reaction time is 0.6-0.7. Although not all studies completely agree, overall racial differences are found consistent with those from IQ testing. Just as for IQ these racial differences are largest on the tasks that best measure the g factor.[10]

Primary lit. needed. In this case, they can be found via Jensen’s 2006 book:

Jensen, A. (2006). Clocking the Mind.

Hereditarians have argued that brain size is highly heritable and when reviewing numerous studies, then indirectly measured brain size (such as from skull measurements) have a 0.2 correlation with IQ and MRI measured brain size have a 0.4 correlation with IQ. If using the method of correlated vectors to distill g from the subtests of an IQ test, then the correlation was on average 0.63. One study found a correlation of 0.89 between g loadings and number of gray matter clusters.[53]

This review is outdated now. The numbers were artificially high due to publication bias. The best uncorrected estimate is about .25. Perhaps .30 after correction. Note however that overall volume may not be the best measure to use. Cortical surface seems to be better. Better yet, one can combine multiple measures and get a much better estimate.

Pietschnig, J., Penke, L., Wicherts, J. M., Zeiler, M., & Voracek, M. (2014). Meta-Analysis of Associations Between Human Brain Volume And Intelligence Differences: How Strong Are They and What Do They Mean?. Available at SSRN 2512128.

Ritchie, S. J., Booth, T., Hernández, M. D. C. V., Corley, J., Maniega, S. M., Gow, A. J., … & Deary, I. J. (2015). Beyond a bigger brain: Multivariable structural brain imaging and intelligence. Intelligence, 51, 47-56.

Differences in brain size between different species are associated with differences in various musculo-skeletal traits. This can partly be explained as adaptations to an increasingly larger brain. The same musculo-skeletal differences are seen between different human races which has been argued to further support the existence of brain size differences and also make an environmental explanation much more difficult.[56][57]

This I have never heard of. I will research that.

See also the research by Heitor Fernandes.

Critics have pointed to the average brain size differences between men and women and argued that there is no IQ difference which would indicate that brain size is an unreliable measure. Hereditarians have argued that some recent studies do have shown small average IQ differences between the sexes and that regardless there are clearly proven large differences between the sexes regarding narrower abilities. For example, men have on average greater spatial abilities which may be very computationally demanding and require the on average larger male brain areas. Thus, the average brain size differences between the sexes are argued to be significant regarding cognitive differences.[10]

It’s a weak argument. It’s possible for there to be a gender interaction for volume size that does not interfere with the cross-population pattern.

Also, men have greater chronometric abilities.

Another criticism is that the racial brain size differences only can explain a small part of racial IQ differences since the correlation between brain size and IQ in individuals is argued to be relatively low and that calculating the “explained variance” gives an even lower value. One response is that the genes affecting IQ must not necessarily affect brain size but may likely have various effects including non-brain size effects such as on neuronal connections, neuronal metabolism, neuronal insulation, the biochemical environment surrounding neurons, and so on. But even if racial brain size differences only explain a small part of the racial IQ differences, then this may still be very problematic for a 100% environmental theory.

There is at least one study showing that there is a genetic correlation between brain size and IQ scores.

Posthuma, D., De Geus, E. J., Baaré, W. F., Pol, H. E. H., Kahn, R. S., & Boomsma, D. I. (2002). The association between brain volume and intelligence is of genetic origin. Nature neuroscience, 5(2), 83-84.

This is an old article. Check the citations of it to see if there are newer studies.

Direct genetic evidence basically requires two things. One is knowledge about how genes are distributed in different races. This is being rapidly achieved by continued technological developments which have dramatically lowered the costs for analyzing a person’s DNA. Several science projects have already completed analyzing or are in the process of analyzing the DNA of persons from different populations worldwide. One example is the 1000 Genomes Project which is analyzing the DNA of 2600 people from 26 different populations worldwide. 1700 had been completed as of March, 2012 and made publicly available.[64]

There is also ALFRED, often used by Piffer.

The 2015 book The Nature of Race stated that this “model has interesting theoretical and empirical support. Regarding theory, in (at least some) non-human species, climate is associated with between population variation in cognition, brain size, and heritable neural functioning (see, for example: Roth et al., 2010; Roth et al., 2012; Roth et al., 2013); cold evolved populations are, apparently, sharper. For humans, models which assume a simple relationship between selection conditioned on cognitive ability and climatic harshness over the last 60,000 years reasonably predict current global cognitive capital (see: Hart, 2007; relatedly: Grall, 2012). Regarding empirical findings, climate by way of cranial size explains a non-trivial portion of the National IQ variance (see: Meisenberg and Woodley, 2013). Generally, cognitive and cognitively related somatic differences are in agreement with the cold weather model; this model is also in agreement with the literature regarding other species.”[18]

Again, citing primary lit. is needed.

New genetic mutations, which could include ones causing higher IQ, would be more likely to arise in large populations. This can be combined with the cold temperature theory. This would explain why Arctic People who live in very cold areas, but only as small populations, did not evolve a very high IQ. Europeans and East Asian who had large populations and lived in relatively cold areas evolved a higher IQ.[3]

Plausible, but a simulation should be done to check this hypothesis. The difficulty is presumably obtaining reliable population count/density data for older times.

Studies in several countries have found that IQ vary on a north–south gradient inside the countries as predicted by the climate theories.[81][82][83]

Two countries are mentioned, one of them twice. However, it has also been found in some other countries, such as Vietnam.

Also, it was not found in India. I don’t recall whether it was found in China or not.

humanvarieties.org/2014/06/19/hvgiq-vietnam/ r=.33

Another is that in relatively recent European and East Asian societies three key elements are argued to have existed: 1. Class differences that reflect differences in intellectual performance. 2. A higher level of reproductive success in higher social classes than in lower ones. 3. No barriers to downward social mobility. The lower classes are argued to have been gradually replaced by people of higher social origin (and IQ). This is argued to have caused a relatively recent increase in IQ in these societies. Other societies are argued to have lacked stratification, or been too rigidly stratified, or favoring other characteristics than IQ as causes of social mobility.[85]

Primary lit. Cite Clark’s book, not Frost’s blogpost.

The g factor (general factor) is the underlying general mental ability that is measured more or less well by different cognitive tests The existence of g does not exclude the existence also of narrower cognitive abilities (that correlate with one another and g as explained in the IQ article). g has sometimes been criticized for reasons such as being a claimed statistical artifact. Supporters argue that such attempted criticisms have a long history but that all have ultimately failed. For example, there have been many unsuccessful attempts to find important forms of intelligence that do not correlate with g. Furthermore, g is argued to have a very high genetic heritability, to be unchanged by training such as taking repeated IQ tests, to account for almost of all of the predictive ability of cognitive tests, and findings in neurobiology “establish a biological basis for g that is firmer than that of any other human psychological trait”. Furthermore, successfully discrediting g as a statistical artifact would change the situation much less than some may expect. Different races would continue to systematically differ on numerous different tests of various correlated cognitive abilities as well as to continue to systematically differ on various correlated life outcomes and achievements and there would still continue to be correlations between cognitive tests and life outcomes and achievements.[87][51]

The article needs to be rewritten in various places. Genetic heritability? It’s a tautology.

Also the anti-hereditarian IQ researcher James Flynn has rejected attacking g in order to discredit racial IQ differences and has stated regarding Stephen Jay Gould’s book The Mismeasure of Man (which attacked the hereditarian Arthur Jensen) that “Gould’s book evades all of Jensen’s best arguments for a genetic component in the black-white IQ gap, by positing that they are dependent on the concept of g as a general intelligence factor. Therefore, Gould believes that if he can discredit g, no more need be said. This is manifestly false. Jensen’s arguments would bite no matter whether blacks suffered from a score deficit on one or 10 or 100 factors.”[51]

Flynn seems to have changed his mind. See the recent discussion with Gottfredson and Turkheimer:



Differences regarding average IQ is one explanation for differences between different regions regarding early achievements such as the creation of civilizations. However, non-hereditarians have pointed out that the earliest civilizations often occurred in regions which today do not have a very high average IQ. One response is that hereditarians have never argued that IQ (or genetics) is the only explanation for differences between human groups. The earliest civilizations occurred in very fertile river valleys which at the early stages of technological development likely were the only regions which allowed the high population density necessary for the development of civilization. In contrast, factors such as the long, harsh winters and the very hard clay soils in Northern Europe for a long time prevented a high population density. Technology had to advance greatly before this changed, one example being that the technologically much more advanced heavy plough necessary to take advantage of such clay soils was first introduced in the Medieval period.[88][3]

Regarding history and HBD, the best source is:

Hart, M. H. (2007). Understanding human history: An analysis including the effects of geography and differential evolution. Washington Summit Publishers.

Morphological evidence based on for example statues, paintings, skeletal remains, or mummies have often been used to argue that various ancient populations or cognitive elites were quite racially different from the current populations living in the same area. Even if historical records are available, practices such religious conversion are often accompanied by changing personal and family names which often makes it difficult to identify the correct race from the name of persons. The earlier presence of racially different cognitive elites or internal dysgenetic changes in a population would likely be very difficult to detect by genetic studies of the current populations living in these areas.

No references given.

Another question is why areas such as the Sub-Saharan Africa never developed a civilization despite technology and civilizations eventually spreading to many parts of Eurasia and there being many contacts with for example the Egyptian, the Roman and the Islamic civilizations. In contrast, Amerindians (including the Maya who lived in a jungle region) eventually created civilizations (although lacking in several key aspects) despite having no contact with the civilizations and technology in Eurasia.[92]

The claims are too strong. There were pre-colonial civilizations in Africa.


Race differences in intelligence can explain why most conquests of people(s) by another people throughout human history have involved a northern people conquering southern people(s). This despite the northern regions usually being less populous due to harsher climates. For example, China was never threatened by southerners but repeatedly conquered by northerners. Similarly India was repeatedly conquered by northerners. Europeans conquering various southern peoples but not East Asians, etc.[93]

This sounds like a claim that needs to be quantified.

Lynn has argued that East Asians, despite having slightly higher average IQ, have produced much less creative discoveries and innovations in the arts and sciences than Europeans. One possibly explanation is that East Asians have lower average creativity than Europeans. Lynn argues that this is supported by Northeast Asians scoring lower on the personality trait openness to experience.[94]

Better source needed. There are country-level personality measures, but they aren’t very good.

Meisenberg, G. (2015). Do We Have Valid Country-Level Measures of Personality?. Mankind Quarterly, 55(4), 360-382.

A significant part of the debate following The Bell Curve was regarding how important the IQ group differences were for the future achievements of the groups in the US. The book argued for the strong importance of IQ for numerous factors such as future educational achievements, employment, income, divorce rates, and crime.

A summary of the findings is in order. One could e.g. use my image summaries of the book’s findings.


Furthermore, a high IQ has been argued to be more important for group outcome differences than for individual outcome differences. A 2011 study presented “four different channels through which intelligence may matter more for nations than for individuals: (i) intelligence is associated with patience and hence higher savings rates; (ii) intelligence causes cooperation; (iii) higher group intelligence opens the door to using fragile, high-value production technologies; and (iv) intelligence is associated with supporting market-oriented policies.”[95]

A claim which is strongly borne out by my studies of group-level correlates for immigrants in north European countries and for regions and states. E.g. the correlation of national S and IQ is .86 (weighted using sqrt(pop), identical without weights).

There are also some countries that are racially different from the surrounding countries and that perform very differently from these surrounding countries on various variables. Examples include Haiti (predominantly Sub-Saharan Africans) and Singapore (predominantly East Asians).

Israel is another example.


In general, the article is fairly comprehensive, but lacks many primary sources. Right now it is mostly a summary of claims made by Rushton and Jensen’s two review articles, Lynn’s books and Fuerst’s book. The text needs proofreading, as there are a number of confusing uses of terminology as well as many errors. The writer is not a native speaker.

I hope that my comments here can lead to an improved article. Or better, someone should write an updated review of the race and intelligence question. Since a lot of research has been done in the last year or so by Fuerst, Dalliard, Malloy, Hu and myself, it would probably have to be written by one or more of us.


I review recent findings in human behavioral genetics and their implications for selective breeding and estimation of genotypic racial differences in polygenic traits.

1. Polygenic scores from all SNPs vs. p<α SNPs

A recent paper (1) used polygenic scores derived from the Rietveld results (2) to score a non-overlapping sample of European Americans (EA) and African Americans (AA). They found that polygenic scores predicted educational outcomes at r = .18 and r = .11 for EAs and AAs, respectively. In terms of variance explained, this corresponds to 3.24% and 1.21%, respectively. This is small, but not useless. They don’t report confidence intervals, only p value inequalities, so it isn’t easy to see how precise these estimates are (3). The p value inequalities for the two results are p<.001 and p<.01. Note that the sample sizes differ too. The main results table is shown below.
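To make the arithmetic above concrete, here is a minimal Python sketch that converts the reported correlations to percent of variance explained and gives a rough idea of their precision. The confidence intervals use the standard Fisher z approximation and are my own back-of-the-envelope addition, not something the paper reports. They assume the analytic sample sizes quoted further down (917 EAs and 677 AAs) and treat observations as independent, which the sibling structure of the data violates, so the intervals are probably somewhat too narrow.

```python
import math

def r_squared_percent(r):
    """Percent of variance explained by a correlation r."""
    return 100 * r ** 2

def fisher_ci(r, n, z_crit=1.959964):
    """Approximate 95% CI for a correlation via the Fisher z transform.

    Assumes n independent observations (an assumption; siblings are not independent).
    """
    z = math.atanh(r)
    se = 1 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)

# r values are from the text; n values are the analytic sample sizes quoted later.
for label, r, n in [("EA", 0.18, 917), ("AA", 0.11, 677)]:
    lower, upper = fisher_ci(r, n)
    print(f"{label}: r = {r:.2f}, variance explained = {r_squared_percent(r):.2f}%, "
          f"approx. 95% CI for r = [{lower:.3f}, {upper:.3f}]")
```

The point of the sketch is only that, with samples of this size, the intervals around both correlations are fairly wide.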

These findings are interesting because they use polygenic scores based on all SNPs instead of scores derived from just the findings that surpass the NHST threshold, i.e. those that have a p value below the alpha value (p<α).1 Using the full set of betas instead of just the set with p<α results in better predictions. It has even been found that differential weighting of the SNPs does not have a major effect on the predictive power of the derived polygenic scores (4).

This should be seen in light of conceptually related results in psychometrics where it has been shown that it doesn’t matter much if one uses g factor scores, simple sums or even randomly weighted subtests (5). The general mathematical explanation for this is that when one creates a linear combination of (i.e. adds together) many variables, the common variance (‘the signal’) adds up while the unshared variance (‘the noise’) does not. Thus, the more variables one averages, the more signal relative to noise (simplifying a bit). The general idea goes back at least to 1910, when Spearman and Brown independently derived a formula for it (6). Their papers were published in the same journal, even in the same issue (7,8). Another example of multiple discovery/invention.
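The Spearman-Brown formula referred to above can be written as a short function. For k parallel components with average intercorrelation r, the reliability of their sum is k·r / (1 + (k − 1)·r). The sketch below uses made-up values of r and k purely for illustration. Applying the formula to SNPs is only a loose analogy, since SNPs are not parallel measurements in the psychometric sense.

```python
def spearman_brown(r, k):
    """Reliability of a composite of k parallel components.

    r: average correlation between components (or single-component reliability)
    k: number of components combined
    """
    return k * r / (1 + (k - 1) * r)

# Illustrative values only: even weakly correlated components add up.
for k in (1, 5, 20, 100):
    print(f"k = {k:3d}: composite reliability = {spearman_brown(0.10, k):.3f}")
```

The signal share grows steadily with the number of components, which is the sense in which the common variance adds up while the noise does not.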

Focusing on the number of SNPs with p<α for a trait is the wrong metric to focus on. One should instead think of the observed correlation (or other effect size measure if the outcome data is categorical) between polygenic scores and outcomes in cross-validation samples. Thinking of SNPs where p<α is dichotomous thinking instead of continuous thinking. When using dichotomous models for phenomena that really are continuous, one will get threshold effects that bias the effect sizes downwards.
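The downward bias from dichotomizing can be shown with a small simulation. This is a sketch of the general statistical point only, not of SNP selection by p value. The true correlation, sample size and seed are arbitrary choices of mine. A continuous predictor is split at its median, and its correlation with the outcome is compared before and after the split.

```python
import math
import random

def pearson(xs, ys):
    """Plain Pearson correlation of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def simulate(rho=0.5, n=20000, seed=1):
    """Return (r with continuous x, r with median-split x) for a true correlation rho."""
    rng = random.Random(seed)
    xs = [rng.gauss(0, 1) for _ in range(n)]
    # y is built so that its population correlation with x equals rho
    ys = [rho * x + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1) for x in xs]
    xs_split = [1.0 if x > 0 else 0.0 for x in xs]  # dichotomize at the median
    return pearson(xs, ys), pearson(xs_split, ys)

r_continuous, r_dichotomous = simulate()
print(f"continuous predictor:   r = {r_continuous:.3f}")
print(f"dichotomized predictor: r = {r_dichotomous:.3f}")
```

For normally distributed variables a median split shrinks the correlation by a factor of roughly 0.8, and splits further from the median shrink it more.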

2. Inconvenient results can be made to go away (maybe)

Since the study contained both an EA and an AA sample with mean IQs of 105.1 and 94.3, it should be possible to derive polygenic scores for members of both groups and compare the means of the groups. This would be a test of the genetic hypothesis for the well known cognitive ability difference between the groups (9–11). There are two things worth noting, however.

First, the group difference is only 10.8 IQ points, smaller than the usual gap found. There is some question as to whether the gap has been changing over time (12,13). Some newer samples find smaller gaps, especially those based on WORDSUM scores (14), while others find standard-sized gaps (~1 SD, 15 IQ points) (15). The smaller than expected gap in the samples may result from selection bias in the AA sample (presumably it is difficult to recruit very low S inner city AAs for scientific studies). Note that results are generally weaker for this sample, which is expected given restriction of range.

Second, not all persons in the groups were genotyped. Those that were had lower mean IQs of 103.9 and 91.6, respectively. This gives a gap of 12.3 points. Note that the reason the EA score is not 100 is that the overall Add Health sample mean is set to ~100 (100.6).

Despite these caveats, the polygenic scores would be interesting to see. However, the authors decided to standardize the results within each group, such that the mean of the polygenic scores was 0 for both groups. This of course makes any group difference impossible to see. They provide the following rationale:

The 917 European Americans (EAs) in our analytic sample are in 386 sibling pairs and 12 sibling trios, with an additional 109 singletons. The 677 African Americans (AAs) are in 100 sibling pairs and four trios, with an additional 465 singletons. Table 1 shows characteristics of the EA and AA sibling pairs study participants who provided genetic data and constitute our analytic sample. The table also shows characteristics of the full Add Health EA and AA samples for comparison. The EAs in our analytic sample are largely comparable to the full population of EA respondents in the Add Health study. The AAs in our sample are less educated, have less educated parents, and score lower on the verbal intelligence measure as compared to all AA Add Health participants. The bulk of our analysis is focused on the EA sample because the original Rietveld et al. (2013) GWAS was conducted on European-descent individuals. Replication of polygenic scores discovered in EA samples among AA samples may be compromised because LD differences in the groups lead to less precision among AA samples. Accordingly, large-scale GWASs of educational attainment in African Americans will be needed to better quantify genetic influences on attainment in this population. Nevertheless, in the interest of testing the extent to which findings made in European-descent individuals replicate in a different population, we conduct several analyses of the AA sample. Due to the small number of AA sibling pairs in the data, sibling analyses are conducted only in EAs.

The rationale is not entirely unreasonable, but not sufficient reason not to standardize the polygenic scores for both samples together. In my opinion, the reason they provide should be taken into account when interpreting the results, but is not sufficient for not showing the results. My guess is that they did calculate the scores for both groups and compared them. Upon finding that the AA sample had a lower mean polygenic score than the EA sample, they decided that result was too toxic to publish. Reverse publication bias in effect. See also this post. A respected academic acquaintance of mine contacted the authors but they refused to share the results.
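Why within-group standardization hides any mean difference can be seen in a toy example. The numbers below are invented for illustration and have nothing to do with the actual Add Health data. The sketch z-scores two groups separately and then together.

```python
from statistics import mean, pstdev

def zscore(values):
    """Standardize a list to mean 0 and (population) SD 1."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]

# Hypothetical raw scores for two groups (made-up numbers).
group_a = [0.2, 0.5, 0.9, 1.2]
group_b = [-0.4, 0.0, 0.3, 0.7]

# Standardizing within each group forces both means to 0.
within_a, within_b = zscore(group_a), zscore(group_b)

# Standardizing the pooled sample keeps the difference between the group means.
pooled = zscore(group_a + group_b)
pooled_a, pooled_b = pooled[:len(group_a)], pooled[len(group_a):]

print("within-group means:", round(mean(within_a), 6), round(mean(within_b), 6))
print("pooled means:      ", round(mean(pooled_a), 3), round(mean(pooled_b), 3))
```

Whatever the raw difference is, the within-group version reports means of 0 for both groups by construction, so only pooled standardization can show a gap.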

Lastly, one can use the combined sample to investigate whether the data shows a Simpson’s paradox pattern. The lack of such a pattern is a central finding of Fuerst’s and my upcoming paper (16). Jensen’s default hypothesis (17) predicts the absence of such a pattern since the same genetic causes are postulated to be involved in the within race differences as those between them.

3. Polygenic scores and sibling pairs

Another interesting aspect of the study is that they have sibling data. Since siblings receive a random mix of genes from their parents, they will differ in their genotypic values for polygenic traits. This was also found in this sample: “The mean sibling difference in polygenic scores in the EA sample was 0.8.” (they did not calculate this for the AA sample, stating that it was too small). In other words, the difference between siblings is nearing the size of the mean difference in the whole population. The same result is known to be true for siblings and IQ scores. The mean difference is about 11 IQ points compared to a full sample mean difference of 17 IQ points (17). This gives a ratio of 11/17 = .65. Since the educational attainment data is standardized, we know that the mean difference in scores between two random persons is 1.13 (Fuerst posted the formula here, but I’m not sure about the source). This gives a ratio of .8/1.13 = .71. These ratios are pretty close, as they should be.
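The 1.13 and 17 figures follow from a standard result for normal distributions. If X and Y are independent draws from a normal distribution with standard deviation σ, then X − Y is normal with standard deviation σ√2, and the expected absolute difference is 2σ/√π. A minimal sketch of the calculation, assuming normality and using the sibling differences quoted above:

```python
import math

def expected_abs_difference(sd):
    """E|X - Y| for two independent draws from a normal distribution with the given SD."""
    return 2 * sd / math.sqrt(math.pi)

iq_random_pairs = expected_abs_difference(15)   # IQ scale, SD = 15
std_random_pairs = expected_abs_difference(1)   # standardized scores, SD = 1

# Sibling mean differences (11 IQ points, 0.8 SD) are taken from the text.
iq_ratio = 11 / iq_random_pairs
pgs_ratio = 0.8 / std_random_pairs

print(f"random-pair difference, IQ scale: {iq_random_pairs:.2f}")
print(f"random-pair difference, z scale:  {std_random_pairs:.3f}")
print(f"sibling / random ratio, IQ:       {iq_ratio:.2f}")
print(f"sibling / random ratio, scores:   {pgs_ratio:.2f}")
```

Both ratios are the sibling difference divided by the random-pair difference, so they are directly comparable.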

We care about sibling comparisons because they by design control for shared environment effects, so that we don’t need to control them statistically (18). The authors found that results held within sibling pairs, an important finding. The table below is from their paper:

As we will see below, this has another important practical implication.

4. Genetic engineering and causal variants

Since socially valued outcomes have non-zero heritability (19), it means that it is in theory possible to improve the outcomes by genetic means, just as we have done for animals. I see two main routes to do this: selection among possible children and direct editing.

The first method is widely used but so far only for a small number of traits. When two persons want a child, the usual method involves having sex and producing a fetus. As mentioned above, this fetus will have a random combination of genes from the parents. If the same parents produce a different combination we call it a sibling.

Selective abortion involves screening fetuses in the womb for anomalies and aborting ones sufficiently undesirable. Probably the most common target for this is Down’s syndrome, which is substantially reduced due to the high rate of abortions when it is detected (20,21). For Denmark, the abortion rate given detection is 99%.

Selective abortion is better than nothing but it is not a good method. Not only is it painful for the woman, but it is inefficient because one has to wait until one can perform a prenatal screening. At that point, the fetus is many weeks old. Furthermore, abortions can result in infertility.

Embryo selection is the natural extension of the same idea. Instead of selectively aborting fetuses, we select between embryos (fertilized eggs). Essentially, we choose an embryo and implant it. This illustration shows how this works.

The second and best option for genetic engineering is to edit the genes directly. In that way one could potentially create a genome free of known flaws. This would involve using something like CRISPR.

The problem with direct editing is that we need to know the actual causal variants. This is not required for selection among possible children. Here it is sufficient that we can make predictions. The difference here is that the SNPs we know are in most cases probably not the causal variants. Instead, they are proxies for the causal variants because they are in linkage disequilibrium (LD) with them. In simple terms, the reason for this is that the mixing of gene variants from sexual reproduction (meiosis) happens at random, but in chunks. Thus, gene variants that are located closer to each other in the genome tend to travel together during splits. This means that they get correlated, which we call LD.

Since practical use of embryo selection requires working on sibling embryos, it is necessary that we can make genomic predictions among siblings that work. The new paper showed that we can do this for educational attainment.

5. Validity of GWAS results across racial groups

There are two matters. The first is to which degree the genetic architecture of polygenic traits is similar across racial groups, i.e. if the same genes cause traits across populations or if there is substantial race-level gene-gene interaction (epistasis). The second is the degree to which SNP betas derived from one race can be used to make valid predictions for another race.

For polygenic traits that have been under selection for many thousands of years (e.g. cognitive ability or height (22)), I think substantial race-level gene-gene interaction is implausible. Such interactions are however plausible for traits that involve a small number of genes and show substantial race differences, such as those for hair, eye and skin color.

LD patterns change over time. Since the LD patterns change independently and randomly in each population, they will tend to become different with time.

If the GWAS SNPs owe their predictive power to being actual causal variants, then LD is irrelevant and they should predict the relevant outcome in any racial group. If however they owe wholly or partly their predictive power to just being statistically related to causal variants, they should be relatively worse predictors in racial groups that are most distantly related. One can investigate this by comparing the predictive power of GWAS betas derived from one population on another population. Since there are by now 1000s of GWAS, meta-analyses have in fact made such comparisons, mostly for disease traits. Two reviews found substantial cross-validity for the Eurasian population (Europeans and East Asians), and less for Africans (usually African Americans) (23,24). The first review only relied on SNPs with p<α and found weaker results. This is expected because using only these is a threshold effect, as discussed earlier.

The second review (from 2013; 299 included GWAS) found much stronger results, probably because it included more SNPs and because they also adjusted for statistical power. Doing so, they found that: ~100% of SNPs replicate in other European samples when accounting for statistical power, ~80% in East Asian samples but only ~10% in the African American sample (not adjusted for statistical power, which was ~60% on average). There were fairly few GWAS for AAs however, so some caution is needed in interpreting the number. Still, this throws some doubt on the usefulness of GWAS results from Europeans or Asians used on African samples (or the reverse).

Which brings us back to…

6. Low cross-validity of GWAS betas and polygenic scores for educational attainment in AAs

Despite the relatively weak evidence for European sample derived GWAS betas in Africans, the study mentioned in the beginning of this review (1) still found a reliable polygenic correlation of .11 in AAs. However, AAs are an admixed group that is about 75-85% African and 15-25% European (25,26). The exact admixture proportions depend on the selectivity of the sample. Bryc et al used the 23andme database which represents individuals willing to pay to have their genomes sequenced. Since this requires both money (price is about $100 for US citizens) and interest in genetic results, this will lead to selection for S (27) and cognitive ability. Both traits are known to correlate with European admixture at the individual, region and country levels (16), which would then result in higher proportions of European admixture in the AA sample. Shriver et al’s sample is more representative and found mean proportions of 78.7% and 18.6% for African and European ancestry respectively.

If we make the assumption that the polygenic correlation for educational attainment in the AA sample is purely due to the European admixture, we can make a prediction for the effect size, namely that it should be about 20% of the size of that for Europeans. I’m not sure but I think that in this case one should use the proportion of variance, not the correlation coefficient. Recall that these were 3.24% and 1.21% (r’s .18 and .11), which gives a ratio of .37. This is higher than the expected value of .186. This means that there is an excess validity of .187 in the African part of their genome under the null model. We can use this to make an estimation of the cross-racial validity. Since we have accounted for AAs’ European admixture, the rest of the predictive power must come from the African admixture (ignoring Native American admixture for simplicity), which constitutes 78.7%. This gives an estimated cross-racial validity ratio of about .24 (.187/.787). In a pure African sample, this corresponds to an estimated correlation coefficient of .09 (sqrt(.18^2 * .24)). Future studies will reveal how far off these estimates are, but most importantly, they are quantitative predictions, not merely qualitative (directional) (28).
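The arithmetic in the paragraph above can be retraced step by step. This is only a sketch of that calculation under the same null model. The variable names are made up, and all inputs are the numbers quoted in the text.

```r
# Inputs taken from the text above
r_ea = 0.18        # polygenic score correlation in the EA sample
r_aa = 0.11        # polygenic score correlation in the AA sample
eur_share = 0.186  # mean European ancestry in AAs (Shriver et al)
afr_share = 0.787  # mean African ancestry in AAs (Shriver et al)

# Ratio of variance explained (r^2) in AAs relative to EAs
var_ratio = r_aa^2 / r_ea^2

# Validity left over after subtracting what European admixture alone predicts
excess = var_ratio - eur_share

# Rescale by the African share to get the cross-racial validity ratio
validity_ratio = excess / afr_share

# Implied correlation in a sample with 100% African ancestry
r_african = sqrt(r_ea^2 * validity_ratio)

round(c(var_ratio, excess, validity_ratio, r_african), 3)
```

Changing the ancestry shares (e.g. to the Bryc et al values) shows how sensitive the predicted correlation is to the admixture estimate used.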

7. Poor African-Eurasian cross-validity and the Piffer method

The findings related to the relatively poor, but non-zero cross-validity of GWAS betas between European and African samples throw some doubt on the SNP evidence found by Piffer in his studies of the population/country IQ and cognitive ability SNP factors (29). If the betas for the SNPs identified in European sample GWAS do not work well as predictors for Africans, they would be equally unsuitable for estimating mean genotypic cognitive ability from SNP frequencies. Thus, further research is needed to more precisely estimate the cross-racial validity of GWAS betas, especially with regards to African vs. Eurasian samples.


1. Domingue BW, Belsky DW, Conley D, Harris KM, Boardman JD. Polygenic Influence on Educational Attainment. AERA Open. 2015 Jul 1;1(3):2332858415599972.

2. Rietveld CA, Medland SE, Derringer J, Yang J, Esko T, Martin NW, et al. GWAS of 126,559 Individuals Identifies Genetic Variants Associated with Educational Attainment. Science. 2013 Jun 21;340(6139):1467–71.

3. Cumming G. The New Statistics: Why and How. Psychol Sci. 2014 Jan 1;25(1):7–29.

4. Kirkpatrick RM, McGue M, Iacono WG, Miller MB, Basu S. Results of a “GWAS Plus:” General Cognitive Ability Is Substantially Heritable and Massively Polygenic. PLoS ONE. 2014 Nov 10;9(11):e112390.

5. Ree MJ, Carretta TR, Earles JA. In Top-Down Decisions, Weighting Variables does Not Matter: A Consequence of Wilks’ Theorem. Organ Res Methods. 1998 Oct 1;1(4):407–20.

6. Carroll JB. Human cognitive abilities: A survey of factor-analytic studies [Internet]. Cambridge University Press; 1993 [cited 2015 Jun 3]. Available from: www.google.com/books?hl=en&lr=&id=i3vDCXkXRGkC&oi=fnd&pg=PR7&dq=Carroll,+1993+human+cognitive+abilities&ots=3b3O4R_IKc&sig=wOss3EHXu37Q3_OZV9Due_3wyFg

7. Spearman C. Correlation Calculated from Faulty Data. Br J Psychol 1904-1920. 1910 Oct 1;3(3):271–95.

8. Brown W. Some Experimental Results in the Correlation of Mental Abilities. Br J Psychol 1904-1920. 1910 Oct 1;3(3):296–322.

9. Fuerst J. Ethnic/Race Differences in Aptitude by Generation in the United States: An Exploratory Meta-analysis. Open Differ Psychol [Internet]. 2014 Jul 26 [cited 2014 Oct 13]; Available from: openpsych.net/ODP/2014/07/ethnicrace-differences-in-aptitude-by-generation-in-the-united-states-an-exploratory-meta-analysis/

10. Rushton JP, Jensen AR. Thirty years of research on race differences in cognitive ability. Psychol Public Policy Law. 2005;11(2):235–94.

11. Fuerst J. The facts that need to be explained [Internet]. Unwelcome Discovery. 2012 [cited 2015 Aug 31]. Available from: z139.wordpress.com/2012/06/10/the-facts-that-need-to-be-explained/

12. Fuerst J. Secular Changes in the Black-White Cognitive Ability Gap [Internet]. Human Varieties. 2013 [cited 2015 Aug 31]. Available from: humanvarieties.org/2013/01/15/secular-changes-in-the-black-white-cognitive-ability-gap/

13. Malloy J. The Onset and Development of B-W Ability Differences: Early Infancy to Age 3 (Part 1) [Internet]. Human Varieties. 2013 [cited 2015 Aug 31]. Available from: humanvarieties.org/2013/05/26/the-onset-and-development-of-b-w-ability-differences-early-infancy-to-age-3-part-1/

14. Hu M. An update on the secular narrowing of the black-white gap in the Wordsum vocabulary test (1974-2012) [Internet]. 2014 [cited 2015 Aug 31]. Available from: osf.io/hiuzk/

15. Frisby CL, Beaujean AA. Testing Spearman’s hypotheses using a bi-factor model with WAIS-IV/WMS-IV standardization data. Intelligence. 2015 Jul;51:79–97.

16. Fuerst J, Kirkegaard EOW. Admixture in the Americas. In London, UK.; 2015. Available from: docs.google.com/presentation/d/1hjhOiitk0MnqMHgVthyj8j7qa4qcAqPUDNaTT8rpetg/edit?pli=1#slide=id.p

17. Jensen AR. The g factor: the science of mental ability. Westport, Conn.: Praeger; 1998.

18. Murray C. IQ and income inequality in a sample of sibling pairs from advantaged family backgrounds. Am Econ Rev. 2002;339–43.

19. Polderman TJC, Benyamin B, de Leeuw CA, Sullivan PF, van Bochoven A, Visscher PM, et al. Meta-analysis of the heritability of human traits based on fifty years of twin studies. Nat Genet. 2015 May 18;47(7):702–9.

20. Natoli JL, Ackerman DL, McDermott S, Edwards JG. Prenatal diagnosis of Down syndrome: a systematic review of termination rates (1995-2011). Prenat Diagn. 2012 Feb;32(2):142–53.

21. de Graaf G, Buckley F, Skotko BG. Estimates of the live births, natural losses, and elective terminations with Down syndrome in the United States. Am J Med Genet A. 2015 Apr;167A(4):756–67.

22. Joshi PK, Esko T, Mattsson H, Eklund N, Gandin I, Nutile T, et al. Directional dominance on stature and cognition in diverse human populations. Nature. 2015 Jul 23;523(7561):459–62.

23. Ntzani EE, Liberopoulos G, Manolio TA, Ioannidis JPA. Consistency of genome-wide associations across major ancestral groups. Hum Genet. 2011 Dec 20;131(7):1057–71.

24. Marigorta UM, Navarro A. High Trans-ethnic Replicability of GWAS Results Implies Common Causal Variants. PLoS Genet [Internet]. 2013 Jun [cited 2015 Aug 31];9(6). Available from: www.ncbi.nlm.nih.gov/pmc/articles/PMC3681663/

25. Bryc K, Durand EY, Macpherson JM, Reich D, Mountain JL. The genetic ancestry of African Americans, Latinos, and European Americans across the United States. Am J Hum Genet. 2015 Jan 8;96(1):37–53.

26. Shriver MD, Parra EJ, Dios S, Bonilla C, Norton H, Jovel C, et al. Skin pigmentation, biogeographical ancestry and admixture mapping. Hum Genet. 2003 Feb 11;112(4):387–99.

27. Kirkegaard EOW, Fuerst J. Educational attainment, income, use of social benefits, crime rate and the general socioeconomic factor among 71 immigrant groups in Denmark. Open Differ Psychol [Internet]. 2014 May 12 [cited 2014 Oct 13]; Available from: openpsych.net/ODP/2014/05/educational-attainment-income-use-of-social-benefits-crime-rate-and-the-general-socioeconomic-factor-among-71-immmigrant-groups-in-denmark/

28. Velicer WF, Cumming G, Fava JL, Rossi JS, Prochaska JO, Johnson J. Theory Testing Using Quantitative Predictions of Effect Size. Appl Psychol Psychol Appl. 2008 Oct;57(4):589–608.

29. Piffer D. A review of intelligence GWAS hits: their relationship to country IQ and the issue of spatial autocorrelation [Internet]. 2015 [cited 2015 Aug 2]. Available from: figshare.com/articles/A_review_of_intelligence_GWAS_hits_their_relationship_to_country_IQ_and_the_issue_of_spatial_autocorrelation_/1393160


1 For GWAS the alpha value is usually set at 5*10^-8. The number comes from correcting the standard α=.05 (95% theoretical true positive rate) for multiple testing when using SNP data: .05 * 1e-6 = 5e-8.
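The same threshold written as a Bonferroni-style correction, as a minimal sketch. It assumes the conventional round figure of one million effectively independent tests, and the variable names are made up.

```r
# Bonferroni-style correction: divide the nominal alpha by the number of
# tests. One million is an assumed round number of independent SNP tests.
alpha_nominal = 0.05
n_tests = 1e6
alpha_gwas = alpha_nominal / n_tests
alpha_gwas
```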

Sometimes you need to use a function that wants a numeric matrix as input. One such function is cv.glmnet() which performs lasso regression with cross validation, which is very cool. Unfortunately, it is picky about how it wants the input data. Here are some lines of my code:

fit_cv = cv.glmnet(x = as.matrix(temp_df[predictors]), #predictor vars matrix
                   y = as.matrix(temp_df[dependent]), #dep var matrix
                   weights = weights_, #weights
                   alpha = alpha_) #type of shrinkage

We see that x must be a matrix of the predictors, y must be a matrix with the dependent (usually just one), the weights and alpha are optional, but since I am working with aggregate data I am almost always using weights. Alpha controls the kind of shrinkage used.

All well and good, until it isn’t. In my case, the predictor data.frame contains some factor variables. R actually uses numeric values as its internal representation of these, but displays them with strings. For instance:

DF = data.frame(a = 1:3, b = letters[10:12],
                c = seq(as.Date("2004-01-01"), by = "week", len = 3),
                stringsAsFactors = TRUE)

Which prints out like this:

> DF
  a b          c
1 1 j 2004-01-01
2 2 k 2004-01-08
3 3 l 2004-01-15

However, suppose we use my as.matrix solution above, then we get:

> as.matrix(DF)
     a   b   c           
[1,] "1" "j" "2004-01-01"
[2,] "2" "k" "2004-01-08"
[3,] "3" "l" "2004-01-15"

Which is not what we wanted. It gave us a character matrix, which cv.glmnet() will then throw a nonsensical error about. Their bad error message made me spend some time finding the actual error. Save yourself and others time. Always write good error messages for functions that will be used more than a couple of times!

Is there some easy built in way to solve the problem?

> as.numeric(DF)
Error: (list) object cannot be coerced to type 'double'

The easiest solution did not work.

> as.numeric(DF$b)
[1] 1 2 3

However, it does work for a single column. So maybe we can just try using it on all the columns:

> apply(DF, 2, as.numeric)
     a  b  c
[1,] 1 NA NA
[2,] 2 NA NA
[3,] 3 NA NA
Warning messages:
1: In apply(DF, 2, as.numeric) : NAs introduced by coercion
2: In apply(DF, 2, as.numeric) : NAs introduced by coercion

What? What is going on?

> apply(as.matrix(DF), 2, as.numeric)
     a  b  c
[1,] 1 NA NA
[2,] 2 NA NA
[3,] 3 NA NA
Warning messages:
1: In apply(as.matrix(DF), 2, as.numeric) : NAs introduced by coercion
2: In apply(as.matrix(DF), 2, as.numeric) : NAs introduced by coercion

It looks like apply() does a silent as.matrix() which then causes the NAs. OK. How do we convert just the factor columns then? Maybe try some of the more fancy built in conversion calls:

> as.matrix.data.frame(DF)
     a   b   c           
[1,] "1" "j" "2004-01-01"
[2,] "2" "k" "2004-01-08"
[3,] "3" "l" "2004-01-15"


> as.data.frame.matrix(DF)
  a b     c
1 1 j 12418
2 2 k 12425
3 3 l 12432

Closer, this time the date got converted, but the factor got converted to character, not integers. We could do a loop:

> for (col_idx in seq_along(DF)) {
+   DF[col_idx] = as.numeric(DF[[col_idx]])
+ }
> DF = as.matrix(DF)
> str(DF)
 num [1:3, 1:3] 1 2 3 1 2 ...
 - attr(*, "dimnames")=List of 2
  ..$ : NULL
  ..$ : chr [1:3] "a" "b" "c"

Which works, but now it is getting silly. Maybe some implicit loops:

> lapply(DF, as.numeric)
[1] 1 2 3
[1] 1 2 3
[1] 12418 12425 12432

Closer, but it returns a list, not a matrix. Maybe just try converting:

> as.matrix(lapply(DF, as.numeric))
a Numeric,3
b Numeric,3
c Numeric,3

But no no, life isn’t that easy. What about as.data.frame?

> as.data.frame(lapply(DF, as.numeric))
  a b     c
1 1 1 12418
2 2 2 12425
3 3 3 12432

Huh, that works, but as.matrix didn’t. Oh well, just one final step:

> DF = as.matrix(as.data.frame(lapply(DF, as.numeric)))
> str(DF)
 num [1:3, 1:3] 1 2 3 1 2 ...
 - attr(*, "dimnames")=List of 2
  ..$ : NULL
  ..$ : chr [1:3] "a" "b" "c"

We got what we wanted!

Sometimes, R does not make your life easy.
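For what it is worth, base R also has data.matrix(), which appears to do this conversion in one call. The sketch below rebuilds the same example data frame (with length.out written out in full) and is an alternative to the steps above, not the approach used in the post.

```r
# Same example data frame as above
DF = data.frame(a = 1:3, b = letters[10:12],
                c = seq(as.Date("2004-01-01"), by = "week", length.out = 3),
                stringsAsFactors = TRUE)

# data.matrix() converts every column to numeric and binds them into a
# matrix; factors are replaced by their internal integer codes
m = data.matrix(DF)
str(m)
```

One caveat: integer codes treat the factor levels as if they were ordered numbers. For unordered categories with more than two levels, dummy coding (e.g. with model.matrix()) is commonly used before fitting a model.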


The book is on Libgen (free download).

Since I have ventured into criminology as part of my ongoing research program into the spatial transferability hypothesis (psychological traits are stable when people move around, including between countries) and the immigrant groups by country of origin studies, I thought it was a good idea to actually read some criminology. So since there was a recent book covering genetically informative studies, this seemed like a decent choice, especially because it was also available on libgen for free! :)

So basically it is a debate book with a number of topics. For each topic, someone (or a group of someones) will argue for or explain the non-genetic theories/hypotheses, while another someone will sum up the genetically informative studies (i.e. behavioral genetics studies into crime) or at least biologically informed (e.g. neurological correlates of crime).

Initially, I read all the sociological chapters too until I decided they were a waste of time to read. Then I just read the biosocial ones. If you are wondering about the origin of that term as opposed to the more commonly used synonym sociobiological, the use of it was mostly a move to avoid the political backlash. One of the biosocial authors explained it like this to me:

In terms of the name biosocial (versus sociobiological), I think the name change happened accidentally. But there was somewhat of a reason, I guess. EO Wilson and sociobiological thought was so hated amongst sociologists and criminologists, none of us would have gotten a job had we labelled ourselves sociobiologists. Though it was no great secret that sociobiology gave birth to our field. In some ways, it was purely a semantic way to fend off attacks. Even so, there are some distinctions between us and old school sociobiology (use of behavior genetic techniques, etc.).

The book suffers from the widespread problem in social science of not giving effect size numbers. This is more of a problem for the sociological chapters, but true also for the biosocial ones. If effect sizes are not reported, one cannot compare the importance of the alleged causes! Note that behavioral genetics results inherently include effect sizes. The simplest ACE fitting will output the effect sizes for additive genetics, shared environment and unshared environment+error.
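To illustrate what such effect sizes look like, here is a minimal sketch using Falconer's formulas, which are a rough approximation to a proper ACE model fit (that would normally be done with structural equation modeling). The function name and the twin correlations are made up for illustration and are not results from the book.

```r
# Falconer's approximation to the ACE decomposition from twin correlations.
# r_mz, r_dz: trait correlations for identical and fraternal twin pairs.
falconer_ace = function(r_mz, r_dz) {
  c(A = 2 * (r_mz - r_dz),  # additive genetics
    C = 2 * r_dz - r_mz,    # shared environment
    E = 1 - r_mz)           # unshared environment + measurement error
}

# Hypothetical twin correlations, for illustration only
ace = falconer_ace(r_mz = 0.6, r_dz = 0.4)
ace
```

The three components sum to 1 by construction, so each can be read directly as a proportion of variance.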

Even if you don’t plan to read much of this, I recommend reading the highly entertaining chapter: The Role of Intelligence and Temperament in Interpreting the SES-Crime Relationship by Anthony Walsh, Charlene Y. Taylor, and Ilhong Yun.

From the interactive visualization I previously published to foster an intuitive understanding of the concept:

Tail effects are when there are large differences between groups at the extremes (tails) of distributions. This happens when the distributions differ in either the mean or the standard deviation (or both), even when these differences are quite small. Below we see a density plot of two normal distributions with different means as well as a threshold value (vertical line). The table below the plot shows various summary statistics about the distributions with regards to the threshold. Try playing around with the numbers on the left and see how results change.
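The same idea can be shown with a few lines of R, as a minimal sketch. It assumes two normal distributions, and the means, SDs and thresholds are hypothetical numbers chosen for illustration, as is the function name.

```r
# Proportion of each group above a threshold, and the ratio between them.
# All numbers below are hypothetical.
tail_ratio = function(threshold, mean_a, mean_b, sd_a = 1, sd_b = 1) {
  p_a = pnorm(threshold, mean = mean_a, sd = sd_a, lower.tail = FALSE)
  p_b = pnorm(threshold, mean = mean_b, sd = sd_b, lower.tail = FALSE)
  c(prop_a = p_a, prop_b = p_b, ratio = p_a / p_b)
}

# Two groups whose means differ by 0.5 SD, evaluated at rising thresholds
res = sapply(c(0, 1, 2, 3), tail_ratio, mean_a = 0.5, mean_b = 0)
colnames(res) = c("t=0", "t=1", "t=2", "t=3")
round(res, 4)
```

The ratio row grows as the threshold moves further into the tail, even though the difference in means stays fixed.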

One of the pleasures of reading a very broad selection of science is that one discovers connections between fields that are not commonly connected. Sometimes these connections may give rise to important new inter-disciplinary fields or understanding, sometimes it just gives you a nice feeling of seeing the same concept in different circumstances.

Tail effects are often discussed in differential psychology because of the continued interest in group differences. These can be in whatever trait: cognitive, interest, emotional, personality-wise, and with whichever groups: social, economic, gender or racial. However, tail phenomena are more general than group differences: the two distributions can be any kind of data, including the same data from different times.

In the last year or so I have taken an increased interest in climate science. The reason is basically this, and that science denialism annoys me and incentivizes me to explore a new area of science. In fact, the whole reason I got started on science in general was that I was debating with creationists on a forum. Debating creationists effectively actually requires a fairly broad knowledge of science and philosophy. One must understand enough physics and chemistry to explain how radiometric dating works, enough cosmology to explain facts related to the big bang, enough geology to explain plate tectonics, enough geology and paleontology to explain the distribution of fossils, enough evolutionary biology and genetics to explain the general ideas of evolution, and finally enough philosophy (logic, critical thinking, epistemology, philosophy of language) to spot logical errors, language and debating tricks. It isn’t exactly easy. Just listing all these areas took a number of minutes; reading the Wikipedia articles will take many hours.

Anyway, since I have been debating climate skeptics recently on a Danish-language conservative-libertarian-nationalist news aggregator, I have encountered many odd claims, which require a fairly deep knowledge of various areas of climate science. E.g. to explain the facts related to Climategate, one needs an idea of temperature reconstruction with tree rings and their dating, scientific graphing, and the methods used to combine the data (principal components analysis, another method from psychometrics :) ). An issue that sometimes comes up is extreme weather. Since there was some discussion of this, including the very definition, I decided to find a review article:
Zwiers, F. W., Alexander, L. V., Hegerl, G. C., Knutson, T. R., Kossin, J. P., Naveau, P., … & Zhang, X. (2013). Climate extremes: challenges in estimating and understanding recent changes in the frequency and intensity of extreme climate and weather events. In Climate Science for Serving Society (pp. 339-389). [odd journal name, but paper seems decent]

The following visual explanation of extreme weather is found in the paper:

[Figure: visual explanation of climate extremes as tail effects, from Zwiers et al. (2013)]

Which showcases the general applicability of the concept. :)

From Reddit. www.reddit.com/r/psychology/comments/3hktp8/try_phacking_for_yourself_the_process_of_fishing/cu8rexf?context=3

jufnitz (2 points):

Not a bad place to plug the excellent, accessible, and snarky Statistics Done Wrong, which explains this problem with p-values along with many other, similarly rampant statistical missteps in natural and social science.

Also too, this:

Science is not a magic wand that turns everything it touches to truth. Instead, “science operates as a procedure of uncertainty reduction,” said Nosek, of the Center for Open Science. “The goal is to get less wrong over time.” This concept is fundamental — whatever we know now is only our best approximation of the truth. We can never presume to have everything right.

…is what philosophers of science have been saying for decades now, contrary to the lay tendency to invest in science the social and ideological capital once held by organized religion. Sure there are scientists who encourage this tendency, but this isn’t necessarily distinguishable from rank self-interest fit for the most soulless corporate executive.


Lewontin is not the right person to cite for skepticism. He is best known for his fallacy with regards to racial divisions.

jufnitz (1 point):

Among the general population of working scientists, Lewontin is “best known” for his groundbreaking contributions to evolutionary biology and evolutionary genetics. The circles among which he’s “best known for his fallacy with regards to racial divisions” tend to be those of scientific eugenics and lay racism.


His Wikipedia page mentions his anti-determinism work/activism in the introduction section too. :)

You can also look at his citations: scholar.google.dk/scholar?hl=en&q=lewontin&btnG=&as_sdt=1%2C44&as_sdtp=

His #1 citation is his famous paper against the adaptationist program, i.e. part of his anti-determinism work. His #2 fits what you are saying. His #3 is again part of his general behavior genetics/differential psychology denialism. #4 concerns the very question of how to apportion human diversity. #5 is his communist biology manifesto (didn’t we try that once before?). And so on.

You can also look at his publications from the last 10 years: they seem to be exclusively or mostly about his denialism project, i.e. political activism. He states this himself in his own books. See also Defenders of the truth.

So yeah, he is mostly known for his political activist biology. He did some real work a long time ago, for which he is still rightfully known to population geneticists.

What is age heaping?

Number heaping is a common tendency of humans. What this means is that we tend to round numbers to the nearest 5 or 10. Age heaping is the tendency of innumerate people to round their age to the nearest 5 or 10, presumably because they can’t subtract to infer their current age from their birth year and the current year. Psychometrically speaking, this is a very easy mathematical test, so why is it useful? Surely everybody but small children can do it now? Yes. However, in the past, not all adults even in Western countries could do this. One can locate legal documents and tombstones from these times and analyze the amount of age heaping. The figure below shows an example of age heaping in old Italian data.

[Figure: age heaping in Italian census data]

Source: “Uniting Souls” and Numeracy Skills. Age Heaping in the First Italian National Censuses, 1861-1881. A’Hearn, Delfino & Nuvolari – Valencia, 13/06/2013.

Since we know that people’s ages really are nearly uniform, that is, the number of people aged 59 and 61 should be about the same as those aged 60, we can calculate indexes for how much heaping there is and use that as a crude numeracy measure. Economic historians have been doing this for some time and so we have some fairly comprehensive datasets for age heaping by now.
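As a rough illustration of such an index, here is a sketch of the Whipple index and its ABCC rescaling, which are common choices in the economic history literature. The text does not say which index the datasets use, so treat that as an assumption. The function names and the simulated ages are made up.

```r
# Whipple index for a vector of reported ages (conventional range: 23 to 62).
# 100 = no heaping, 500 = every reported age ends in 0 or 5.
whipple_index = function(ages) {
  ages = ages[ages %in% 23:62]
  heaped = sum(ages %% 5 == 0)
  heaped / (length(ages) / 5) * 100
}

# ABCC index: rescales Whipple to run from 0 to 100, where 100 = no heaping.
# Values of Whipple below 100 are capped at an ABCC of 100.
abcc_index = function(ages) {
  w = whipple_index(ages)
  min(100, (1 - (w - 100) / 400) * 100)
}

# Simulated data (hypothetical): true ages are uniform, but about 30% of
# people round their age to the nearest multiple of 5
set.seed(1)
true_ages = sample(23:62, 10000, replace = TRUE)
rounds = rbinom(10000, 1, 0.3) == 1
reported = ifelse(rounds, round(true_ages / 5) * 5, true_ages)

c(whipple = whipple_index(reported), abcc = abcc_index(reported))
```

On an index of this kind, a population that reports ages accurately scores at the maximum of 100, which would be consistent with the ceiling effects mentioned below.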

Is it a useful correlate?

If you read the source above you will see that age heaping in the 1800s shows the expected north/south Italy patterns, but this is just one case. Does it work in general? The answer is yes. Below I plot some of the age heaping datasets versus Lynn and Vanhanen’s (2012) national IQs:

[Figures: AH1800, AH1820, AH1850, AH1870 and AH1890 each plotted against national IQ]

The problem with the data is this: the older datasets cover fewer countries and the newer datasets show strong ceiling effects (lots of countries very close to 100 on the x-axis). The ceiling effects are because the test is too easy. Still, the data covers a sufficiently large number of countries to be useful for modern comparisons. For instance, we can predict immigrant performance in Scandinavian countries based on their numeracy ability in the 1800s. Below I plot general socioeconomic performance (a general factor of education, income, use of social benefits and crime in Denmark in 2012) and age heaping in 1890:


The actual correlations are shown below:

          AH1800  AH1820  AH1850  AH1870  AH1890  LV12 IQ  S in DK
AH1800      1.00    0.95    0.94    0.96    0.90     0.85     0.61
AH1820      0.95    1.00    0.94    0.94    0.76     0.62     0.67
AH1850      0.94    0.94    1.00    0.99    0.84     0.73     0.59
AH1870      0.96    0.94    0.99    1.00    0.96     0.64     0.56
AH1890      0.90    0.76    0.84    0.96    1.00     0.52     0.73
LV12 IQ     0.85    0.62    0.73    0.64    0.52     1.00     0.54
S in DK     0.61    0.67    0.59    0.56    0.73     0.54     1.00


And the sample sizes:

          AH1800  AH1820  AH1850  AH1870  AH1890  LV12 IQ  S in DK
AH1800        31      25      22      22      24       29       24
AH1820        25      45      37      22      36       43       27
AH1850        22      37      45      27      37       43       30
AH1870        22      22      27      62      56       61       34
AH1890        24      36      37      56     109      107       50
LV12 IQ       29      43      43      61     107      203       68
S in DK       24      27      30      34      50       68       70


Great, where can I find the datasets?

Fortunately, they are freely available. The easiest solution is probably just to download the worldwide megadataset, which contains a number of the age heaping variables and lots of other variables for you to play around with: osf.io/zdcbq/files/

Alternatively, you can get Baten’s age heaping data directly: www.clio-infra.eu/datasets/indicators

R code

#this is assuming you have loaded the megadataset as DF.supermega
#wtd.cors() comes from the weights package; write_clipboard() is assumed to be defined already
library(ggplot2)
library(stringr)

temp = subset(DF.supermega, select = c("AH1800", "AH1820", "AH1850", "AH1870", "AH1890", "LV2012estimatedIQ", "S.factor.in.Denmark.Kirkegaard2014"))
write_clipboard(wtd.cors(temp), digits = 2)

#plot each age heaping variable against national IQ and save the plot to a file
for (year in c("AH1800", "AH1820", "AH1850", "AH1870", "AH1890")) {
  p = ggplot(DF.supermega, aes_string(year, "LV2012estimatedIQ")) + geom_point() + geom_smooth(method = lm) + geom_text(aes(label = rownames(temp)))
  name = str_c(year, "_IQ.png")
  ggsave(name, plot = p)
}

#age heaping in 1890 against the S factor in Denmark
ggplot(DF.supermega, aes(AH1890, S.factor.in.Denmark.Kirkegaard2014)) + geom_point() + geom_smooth(method = lm) + geom_text(aes(label = rownames(temp)))

Note that perhaps there should be double quotation marks around human in the title. Would humans with a 1,000 SD increase in (general) cognitive ability (CA) really be human?

Steve Hsu discusses his rough estimation that we can increase CA in humans by around 1,000 SD by basically turning all the current alleles with negative effects into their positive or neutral variants. While the reasoning seems sound enough to me, I can think of some problems.

Trait level x gene interactions

One problem is the possibility of trait level x gene interactions. For instance, suppose that a large number of genes affect pathway X to CA in a roughly linear fashion (i.e. what we find using familial studies and GCTA methods). This could be brain nerve conduction velocity (BNCV) for which there is some evidence that it is related to CA (TE Reed, AR Jensen, 1993). One seemingly mostly forgotten study did find evidence that the correlation between IQ and NCV is genetic (FV Rijsdijk, DI Boomsma, 1997). There is a physical limit on how fast BNCV can be such that the closer we get to the physical limit, the smaller the increase we get from altering another negative allele to its positive version. This would be roughly equivalent to the situation in physics with the speed of light. A given amount of energy converted to kinetic energy will result in a smaller increase in m/s as we get closer to the speed of light (the physical limit).
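The speed of light analogy can be made concrete with a small sketch. It uses the standard relativistic relation between kinetic energy and speed and says nothing about the biology itself. The function name and the choice of a 1 kg mass are for illustration only.

```r
# Speed of a mass after receiving a given kinetic energy, using the
# relativistic relation KE = (gamma - 1) * m * c^2.
c_light = 299792458  # speed of light in m/s

speed_from_energy = function(ke, mass = 1) {
  gamma = 1 + ke / (mass * c_light^2)
  c_light * sqrt(1 - 1 / gamma^2)
}

# Add energy in equal steps, each equal to the rest energy of a 1 kg mass
energy_steps = (0:5) * c_light^2
speeds = speed_from_energy(energy_steps)

# Gain in speed (as a fraction of c) from each additional step
round(diff(speeds) / c_light, 3)
```

Each equal step of energy buys a smaller gain in speed than the one before, which is the pattern of diminishing returns the paragraph above has in mind.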

In the comments, Hsu invokes the history of artificial selection on e.g. oil content to argue against trait level x gene interactions. See also: Animal breeding, human breeding. Brains are much more complicated than simple oil content, tho.

Brain size and BNCV

I seem to recall that due to the relatively low BNCV, there is a limit on the practical size of the brain. We know that brain size correlates with CA around .25 (large meta-analysis), which perhaps after corrections for errors will be .35 (restriction of range and measurement error; see Understanding Statistics). The reason this problem happens is that the internal brain communication will become slower as the brain size increases (brief discussion here), which presumably in the end results in a lower (possibly negative in the long run!) increase in CA from changing the alleles that result in larger brains. Solving this could mean requiring more modularization, which presumably would affect the factor structure of cognitive abilities, resulting in a weaker general factor.

Brain size and reproduction

When selecting for one trait, one will simultaneously select for a number of other genetically correlated traits. With cognitive ability, one of them is brain size. However, due to the physical limitation on space in women’s wombs, we cannot just scale up the size of brains indefinitely. The relatively large human head size already results in complications with giving birth in current humans. The birth problem has probably been a relatively strong selective force against higher CA.

We can of course use Caesareans now to avoid between-the-legs birth, so it is not really a problem, but it adds costs to the reproduction process. In the long run, if we scale up brain size a lot, we would need to scale up the size of women’s interior space to accommodate the larger fetus. Note that if we just increase the size of women overall, this would result in a smaller brain to body ratio, which is what really matters. So it won’t be so easy to deal with this problem.

The final (biological reproduction) solution is to stop using women for reproduction: artificial wombs/uterus. This technology is however not being aggressively pursued as far as I know, so it is probably a number of decades away.

Of course, we will want to switch to some other substrate at some point too. :)

Reading up on the huge animal breeding literature gives a useful background to one’s thinking about what selection on humans will do in the future (embryo selection and direct editing à la CRISPR).


I made the above infograph some time ago, maybe 1-2 years. It is still pretty accurate. The newest data for genome sequencing does not look much different.

Steve Hsu has been following some of the animal breeding literature, e.g. Frontiers in cattle genomics.

I dug around a bit and found some reviews. They mentioned various interesting experiments. Of course, the most interesting experiment is still the Russian domesticated fox experiment (I want one of these!). Recently, there was an interesting one about breeding for brain size in guppies.


There are also the famous rat maze ability experiments. Solving mazes is g-loaded in humans (Jensen, 1980, book). A good review is Tolman and Tryon: Early research on the inheritance of the ability to learn.


The newest and most interesting part in relation to humans is using genomic predictors alone. There is a recent, easy to read review: Understanding genomic selection in poultry breeding.

selection for eggs

Because the animal breeding field has been going for so long, one finds hundreds if not thousands of these types of graphs, yet they are still exciting. One might wonder: is there nothing one cannot select for? It seems no matter the trait, evolution finds a way. Dawkins seems to agree:

Political opposition to eugenic breeding of humans sometimes spills over into the almost certainly false assertion that it is impossible. Not only is it immoral, you may hear it said, it wouldn’t work. Unfortunately, to say that something is morally wrong, or politically undesirable, is not to say that it wouldn’t work. I have no doubt that, if you set your mind to it and had enough time and enough political power, you could breed a race of superior body-builders, or high-jumpers, or shot-putters; pearl fishers, sumo wrestlers, or sprinters; or (I suspect, although now with less confidence because there are no animal precedents) superior musicians, poets, mathematicians or wine-tasters. The reason I am confident about selective breeding for athletic prowess is that the qualities needed are so similar to those that demonstrably work in the breeding of racehorses and carthorses, of greyhounds and sledge dogs. The reason I am still pretty confident about the practical feasibility (though not the moral or political desirability) of selective breeding for mental or otherwise uniquely human traits is that there are so few examples where an attempt at selective breeding in animals has ever failed, even for traits that might have been thought surprising. Who would have thought, for example, that dogs could be bred for sheep-herding skills, or ‘pointing’, or bull-baiting?

[from The Greatest Show on Earth]

Selection for High and Low Fatness in Swine


Also interesting is that selective breeding makes it possible to estimate heritability directly from the response to selection (realized heritability), not just from family relationships.
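Realized heritability comes from the breeder's equation, R = h² × S, where S is the selection differential (mean of the selected parents minus the population mean) and R is the response (mean of the offspring minus the population mean). A minimal sketch in Python follows. The function name and the numbers are made up for illustration and are not taken from any of the experiments mentioned here.

```python
def realized_heritability(population_mean, selected_parent_mean, offspring_mean):
    """Estimate realized heritability as h2 = R / S (breeder's equation).

    S = selected_parent_mean - population_mean  (selection differential)
    R = offspring_mean - population_mean        (response to selection)
    """
    selection_differential = selected_parent_mean - population_mean
    response = offspring_mean - population_mean
    if selection_differential == 0:
        raise ValueError("No selection was applied, so h2 cannot be estimated.")
    return response / selection_differential

# Hypothetical numbers: the population averages 100, the animals chosen as
# parents average 110, and their offspring average 104.
h2 = realized_heritability(100.0, 110.0, 104.0)
print(f"realized heritability: {h2:.2f}")
```

In a multi-generation experiment one would typically regress the cumulative response on the cumulative selection differential rather than rely on a single generation, but the logic is the same.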


I think we will see some interesting humans in the future. The reason is this: embryo selection is very close and genetic engineering is fairly close. If some countries ban them, others will allow them. Or one can sail or fly to a seastead. Or use any number of black market solutions that will inevitably spring up. Probably, not all jurisdictions will ban it, so there will be reproductive havens+tourism just like there are tax havens and even suicide havens. I don’t think Western governments will dare to force abortions on pregnant returnees, so there is nothing much they can do at that point. There is also of course the near-impossibility of proving that a fetus is a result of embryo selection, not normal fertilization. After all, embryo selection is just choosing between actual possibilities (hopefully, philosophy readers will allow me the flagrant abuse of modal terminology). If everybody starts having healthier children by using this technology, there will be no way to prove that a particular couple ‘cheated’. It is only in the aggregate one can prove that something is going on. A particular couple may just have been lucky. As for direct editing, it may be possible to spot genetically, but I doubt this will happen.

In the EU, I suspect the legality of this practice will come down to legal interpretation. The EU has a CHARTER OF FUNDAMENTAL RIGHTS OF THE EUROPEAN UNION, in which one can read:

Article 3
Right to the integrity of the person
1.   Everyone has the right to respect for his or her physical and mental integrity.
2.   In the fields of medicine and biology, the following must be respected in particular:
(a) the free and informed consent of the person concerned, according to the procedures laid down by law;
(b) the prohibition of eugenic practices, in particular those aiming at the selection of persons;
(c) the prohibition on making the human body and its parts as such a source of financial gain;
(d) the prohibition of the reproductive cloning of human beings.

But given that selection of persons is widely done for e.g. Down’s syndrome, (b) is clearly ignored in practice. (c) is also ignored e.g. for sperm and egg selling, altho they call it donation (with a nice monetary benefit in return). So, the best hope is that embryo selection for medical reasons will sneak into practice and become so standard that it would seem outlandish to ban it. This is well underway. When the public comes to accept it, the judges will probably make up some legal reason to interpret (b) narrowly, e.g. as to refer to forced sterilization. One may be able to find support for this in the background work for this charter, altho I haven’t looked into it.

Given that the technology will likely come into wide-scale practice within the next couple of decades, what remains to be researched more (a lot more) is how people will actually make choices. When prospective parent(s) have to decide which embryos to implant, there will be a choice. With a limited choice of embryos, one cannot simultaneously maximize all desirable traits and minimize all undesirable traits. There will probably be clear trends in this: few will select against intelligence, few will select for short boys, few will select for nasty diseases, most will select for health and happiness. People like Helen Henderson are not common:

I can say, without hesitation, that my life has been richer because I have MS. How can anyone who has no experience with disabilities understand that?

[From Future Human Evolution.]

If they still try to get children with horrible genetic diseases, the government will probably (should?) step in and ban it.
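The point made earlier, that with a limited number of embryos one cannot maximize every trait at once, can be illustrated with a toy simulation. This is only a sketch under strong assumptions: every embryo gets independent standard-normal scores on each trait, the parents pick the embryo with the highest unweighted sum, and the scores are taken as perfectly accurate. Real traits are genetically correlated and real predictors are noisy, so the numbers mean nothing beyond the direction of the effect. The function name and parameters are my own.

```python
import random

def mean_gain_on_first_trait(n_embryos, n_traits, n_trials=2000, seed=0):
    """Average score on trait 0 of the chosen embryo, in SD units.

    Each embryo gets `n_traits` independent standard-normal scores, and the
    embryo with the highest unweighted sum across traits is chosen.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total = 0.0
    for _ in range(n_trials):
        embryos = [
            [rng.gauss(0.0, 1.0) for _ in range(n_traits)]
            for _ in range(n_embryos)
        ]
        chosen = max(embryos, key=sum)
        total += chosen[0]
    return total / n_trials

# Ten embryos: select on one trait only versus on three traits at once.
gain_one = mean_gain_on_first_trait(n_embryos=10, n_traits=1)
gain_three = mean_gain_on_first_trait(n_embryos=10, n_traits=3)

print(f"gain on trait 0, selecting on 1 trait:  {gain_one:.2f} SD")
print(f"gain on trait 0, selecting on 3 traits: {gain_three:.2f} SD")
```

The more traits the parents care about, the less they gain on any single one, which is why differences in what parents prioritize matter.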

Still, there will be lots of variation. This variation in selective pressure between people should, together with strong assortative mating, result in divergence of human lines. This will be somewhat akin to dog, cat and horse breeds. Assortative mating is apparently so strong that people even choose pets that are similar to themselves: Self seeks like: many humans choose their dog pets following rules used for assortative mating.


We truly live in interesting times. :)

If you want to read more like this, there was also recently the double paper: Eugenics, Ready or Not I, II. (I could not find a link to part 2.)