Review: Intelligence: all that matters (Stuart Ritchie)

I was recently asked to suggest introductory reading materials for bright laymen (see also my previous post). I ended up giving the following recommendations:

– Intelligence: A very short introduction (2001, by Deary, very mainstream researcher). 132 pages. Libgen

– Intelligence: All that matters. 2015. Ritchie (young, mainstream researcher). 160 pages. Libgen

Bit more technical, focused on social effects:

– Why g matters. 1997. Gottfredson (just received her lifetime achievement award at ISIR2016). 54 pages. PDF

Focused on American context, but also fairly nontechnical:

– The Bell Curve. 1994. Herrnstein and Murray (mainstream researchers). 845 pages. Libgen

The change is the addition of Ritchie’s book, which I hadn’t actually read, only heard good things about. Since it doesn’t seem to be proper to recommend books one hasn’t read, I decided to read it today.

The general verdict is that it is probably the best go-to book for introducing someone to intelligence and related research now. It’s somewhat better than Deary’s similar but somewhat dated (2001) book, it’s less technical than Gottfredson’s great 1997 article, and it’s much shorter than The Bell Curve, which is also notably dated by now. The main downside is the mediocre treatment of group differences, a topic that has widespread public interest. It may be a deliberate choice to avoid getting too controversial. After all, getting readers to accept stuff about the high heritability of IQ for adults, the existence of the g-factor, etc. is hard enough. :)

I have some complaints and comments:

It’s common to hear that ‘correlation does not equal causation’, and you should bear this in mind as we discuss the many correlations that have been found involving intelligence test scores. For instance, if intelligence and educational achievement are correlated, this might mean (a) that intelligence causes you to do better in school, (b) that schooling helps you do better on intelligence tests, (c) that something else, perhaps social background, causes you to do better at school and on intelligence tests, or (d) a mixture of all the above. Nevertheless, we shouldn’t forget that a correlation does sometimes imply causation: it’s difficult to think of any instances of causation where there’s no correlation.

The use of ‘imply’ here is incorrect. It’s nonsense to say that something sometimes implies something: implication is an either/or concept. Still, I think causation does imply correlation, though it depends on the exact interpretation.
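To make the "exact interpretation" caveat concrete, here is a minimal sketch, assuming that "correlation" means the ordinary Pearson (linear) coefficient. The variables and the helper `pearson_r` are made up for illustration and are not from the book. In the first case y is completely determined by x, yet the linear correlation is zero because the relationship is symmetric around x = 0.

```python
import math


def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Illustrative values only: x is symmetric around zero.
xs = list(range(-10, 11))

# y is fully caused by x, but the dependence is not linear.
ys_quadratic = [x ** 2 for x in xs]

# y is fully caused by x, and the dependence is linear.
ys_linear = [2 * x + 1 for x in xs]

r_quadratic = pearson_r(xs, ys_quadratic)
r_linear = pearson_r(xs, ys_linear)

print("r for y = x^2:   ", round(r_quadratic, 6))
print("r for y = 2x + 1:", round(r_linear, 6))
```

So under the narrow linear reading, causation can exist without correlation. Under a broader reading (any statistical dependence), the quadratic case still counts as an association.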

Emotional Intelligence

The world of work is particularly prone to fads for techniques that claim to improve productivity. The techniques often have very little evidence to back them up. Is the recent trend for measuring ‘emotional intelligence’ just one of these? Emotional intelligence (sometimes called ‘EQ’) is presented as a measure of the ability to understand the way that you and others are feeling, which would clearly be of importance in jobs that involve dealing with people. The question is, does EQ tell us any more than the measures psychologists already use, like IQ and personality tests? Several studies have shown that emotional intelligence is linked to better performance at work. Importantly, though, it’s not as strongly linked to that performance as IQ (Joseph and Newman, 2010). This may be because we’re simply better at measuring IQ. Some researchers argue that EQ is just a trendy, and less useful, re-description of what we already knew.

It may or may not be worth mentioning that EQ comes in two varieties by the same name: trait (personality) version and ability version. The first is similar to already existing personality measurements and seems to be a kind of self-rated general factor of personality, and the second seems to be moderately to strongly related to standard cognitive ability measures. There is a lack of predictive studies that employ all measures.

One subset of this Study of Mathematically Precocious Youth – as it’s called, although these youths were also precocious in non-mathematical areas – were the best of the best: their SAT scores were the top 0.0001 per cent of the population. And 30 years after they had taken the SAT, these 320 ‘scary smart’ people (to quote the researchers) had achieved an astonishing amount (Kell et al., 2013). They had become high-ranking politicians, CEOs of companies, high-ups in government agencies, distinguished academics, journalists for well-known newspapers, artists and musical directors. They had been awarded patents, grant money and prizes, and had produced plays, novels, and a huge amount of economic value. They had, in other words, made incalculable contributions to society, for everyone’s benefit.

It would have been good to take the Steve Hsuian route and just display the usual figure:


It can be clearly seen that within the very bright group, the brightest did better than the least bright. I also found it odd that the book introduces Terman earlier, but doesn’t mention the Terman results for very high ability groups.

Figure 4.1  The evolution of the human brain. The cranial capacity (skull size) of our ancestors, estimated from the sizes of fossilized skulls, has increased across time. Shown on the graph are the names of different ancestral species, and our species, indicating the approximate time of their first appearance in the fossil record. (Data from Potts, 2011.)

It may be worth mentioning that human brain size seems to have decreased in recent evolution (the last 10k years), only to go up again recently, perhaps in connection with the Flynn effect?

There’s little reason to doubt that larger brains mean better cognition: more brain cells allow for more complex mental processing, in much the same way as adding more RAM to your computer will allow it to run more software without crashing. The archaeological record certainly shows increases in the complexity of tool use, society and culture alongside the growth in brain volume. But the brute size of the brain is far from the full story on the evolution of intelligence. After all, the brains of elephants are much larger than ours, and whereas they’re smart compared to most other animals, nobody is predicting that they’ll be launching rockets to the moon any time in the near future.

Ritchie neglects to mention that it is not really size itself, but something like brain-to-body ratio (or an adjusted variant) that matters.

I didn’t pick those numbers at random: on average, behaviour genetic studies of intelligence have found this same 50 per cent figure. That is, half of the reasons why people vary on intelligence test scores are genetic. Intriguingly, it’s been found that the genetic effect on intelligence is stronger in adults (heritabilities of up to 80 per cent) than it is in children (around 20 per cent), suggesting that our biology becomes more important for our intelligence as we age (Plomin and Deary, 2014). Perhaps different genes come into play as we get older, or the ability of the environment to influence intelligence wanes. Regardless of the age, though, so long as intelligence can reliably be measured, twin studies show that it’s substantially heritable.

Ritchie seems to suggest that early life intelligence is non-biological. I think he meant to write that our genes become more important as we age.

Going beyond brute size and looking at specific brain areas gives us a more nuanced view. Most commonly studied are the frontal lobes, the parts of the brain just above the eyes and directly behind the forehead. Patients with damage to their frontal lobes caused by head injuries, strokes or infections have particular trouble with tests of ‘fluid’ intelligence (see Chapter 2), which involve abstract thinking. This has led some prominent researchers, like neuroscientist John Duncan (2010), to hypothesize that the frontal lobes are responsible for tasks particularly relevant for intelligence, such as planning, organizing and reasoning.

Somewhat surprised to see him not citing his own study on this topic. Perhaps the book was already in print.

In the past ten years, a raft of video games has appeared on the market that claim to improve your brain function. They often include simple tasks, such as counting the number of syllables in phrases, to provide seemingly important – but non-scientific – information like your ‘brain age’. The vast majority of these games are not based on scientific evidence. But one particular variety of brain training has been extensively studied in the lab. The science behind it turned out to be extremely controversial.

Maybe worth mentioning that these attempts at making stupid or retarded people less stupid or normal have a very long, very dubious history. All the more reason to be skeptical of recent claims.

In the 1960s, the Norwegian government decided to add two extra years of schooling to the mandatory curriculum for all pupils. Two additional pieces of good luck allowed researchers, who came on the scene much later, to turn this into a test of the effects of schooling on IQ. First, the reform was implemented across the different parts of Norway in a staggered way – it happened in some areas years before it happened in others. Second, every male Norwegian sits an IQ test as part of their compulsory army service. In 2012 the researchers were able to compare the later IQ scores of those who had been forced to stay in school for extra years with the scores of those who hadn’t (Brinch and Galloway, 2012). They worked out that the extra schooling added 3.7 IQ points per year. This confirmed the results of many other, previous studies that hadn’t used such elegant methods.

As I recall, this study produced the largest such effects observed, but I don’t have a cite at hand. I seem to recall other studies found effects around 2 IQ points per year. Maybe someone else knows?

The same argument can be made for workplace efficiency: those with slightly higher IQs should, on average, be slightly more productive, and these small effects would add up to rather a lot across the whole country, saving employers a great deal of money as the employees make fewer mistakes and finish more tasks. Small effects can matter a great deal if they’re across populations. By raising everyone’s IQ by a small amount – so long as the rise was truly an improvement in intelligence and not just in the ability to take a specific test (we wouldn’t get very far just by telling people the answers to vocabulary test questions, for instance!) – a country could save enormous amounts.

When reading this, I was sure that after having cited some of the studies showing IQ gains from education, he would address the thorny issue of raising IQ scores vs. raising general cognitive ability. After all, Ritchie has himself published a study investigating whether such gains were on g or not (finding that they were not on g). Perhaps another study that didn’t get published in time for the book. But really, this issue is important enough that any introductory book should cover it. IQ is just a metric (or a ‘vehicle’). It is easy to raise: one can just give the students the answers to the tests or have them retake the same tests over and over. But no one thinks this actually makes them smarter (I hope), which also means that one has to be careful with treating IQ changes as changes in the ability level (ratio measurement issue).

Until now, I’ve talked about standard schooling: the kind that every child gets as a matter of course. But there have also been intensive educational projects aimed at intervening early to boost the life chances of very disadvantaged children. The two most famous examples are the ‘Perry Preschool Project’, from the 1960s, and the ‘Abecedarian Project’ from the 1970s (though there have been many others). In both of these US projects, children deemed to be ‘at risk’ because of their low social-class backgrounds were given a structured programme of extra preschool teaching to get them prepared for starting education. Because the interventions were done decades ago, we’re now able to look at the long-term effects, if any, of the programmes. Whereas a short-term boost to IQ was seen, this tended to peter out by the time the children reached adulthood. By that point, there was hardly any IQ difference between those who got the extra preschool and those who didn’t (Barnett, 1998). However, even if IQ wasn’t lastingly affected, there did appear to be other extremely valuable effects, such as reduced rates of criminal offending in the preschool group.
Much larger-scale (but, by economic necessity, less intensive) versions of the projects have been run by successive governments in the US and the UK (called Head Start and Sure Start, respectively). There is intense debate over whether the benefits of these interventions are worth the costs (Puma et al., 2010). Even if the efforts until now have been patchy – especially with respect to long-term effects on IQ – they still provide a great deal of useful data. This helps researchers work out the precise ingredients that might help teachers boost children’s thinking skills.

Pretty misleading (citation bias) to mention the super positive results from these extremely small 60s-70s studies, when recent large studies show more or less zero lasting gains, AND when meta-analysis shows obvious publication bias in these types of studies. E.g. read the huge Head Start Impact Study and see my plot of effect size and standard error in early intervention studies.
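For readers wondering why a relationship between effect size and standard error points to publication bias, here is a rough simulation sketch. Everything in it is assumed for illustration: the data are simulated, not taken from any intervention study, the true effect is set to zero, and the selection rule (significant positive results always get published, others only 10% of the time) is a made-up stand-in for whatever happens in practice.

```python
import math
import random


def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def simulate(n_studies=20000, p_publish_nonsig=0.1, seed=1):
    """Simulate studies of a true effect of zero with selective publication."""
    rng = random.Random(seed)
    all_studies, published = [], []
    for _ in range(n_studies):
        se = rng.uniform(0.05, 0.5)      # small studies have large standard errors
        effect = rng.gauss(0.0, se)      # observed effect; the true effect is 0
        all_studies.append((effect, se))
        significant = effect / se > 1.96  # assumed rule: positive and "significant"
        if significant or rng.random() < p_publish_nonsig:
            published.append((effect, se))
    return all_studies, published


def summarize(studies):
    """Return (mean effect, correlation between effect and standard error)."""
    effects = [e for e, _ in studies]
    ses = [s for _, s in studies]
    return sum(effects) / len(effects), pearson_r(effects, ses)


all_studies, published = simulate()
mean_all, r_all = summarize(all_studies)
mean_pub, r_pub = summarize(published)

print("all studies:       mean effect", round(mean_all, 3), " r(effect, SE)", round(r_all, 3))
print("published studies: mean effect", round(mean_pub, 3), " r(effect, SE)", round(r_pub, 3))
```

Without selection, effect size and standard error are unrelated and the mean effect sits near zero. With selection, the published record shows a positive mean effect and a positive effect-SE correlation, because imprecise studies need large estimates to get through the filter.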

From a rather depressing start, with the failure of the Mozart Effect and the shaky evidence for the benefits of brain training, we’ve come to a somewhat more optimistic conclusion. It’s difficult for an IQ researcher to be very enthusiastic about new techniques that purport to raise intelligence, after all the disappointments of the past. Certainly, the next time you see a technique, game, supplement or pill that claims to boost your brainpower, regard it with extreme scepticism. Nevertheless, we’re lucky that the tools for raising intelligence – which might partly have caused the Flynn Effect – seem to be staring us in the face, in the form of education. Now, intelligence researchers have to find out exactly how education has these effects, and how we can make the most of them.

I don’t know about that. Meta-analysis of the Flynn effect finds that age is not a moderator. If education is a cause of the Flynn effect, gains should be larger for those who have had more of it. Since he cites a big meta-analysis of the Flynn effect (Trahan et al 2014), it seems he should have known this finding.

But it’s not quite so simple. Just looking at the average hides two consistent sex differences. The first is that there are differences in more specific abilities: women tend to do better than men on verbal measures, and men tend to outperform women on tests of spatial ability (Miller and Halpern, 2014); these small differences balance out so that the average general score is the same. The second is that there is a difference in variability: males tend to be over-represented at the very high and the very low levels of intelligence. This was found most clearly in the Scottish data.

There is also the problem of using standard tests to measure gender differences, when test constructors often try to reduce such differences by removing items showing gender differences.
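On the variability point in the quoted passage, a quick sketch shows how a small difference in spread turns into a large over-representation at the extremes. The numbers are assumptions chosen for illustration (equal means of 100, standard deviations of 15 and 14, normal distributions), not estimates from the Scottish data or any other study.

```python
from statistics import NormalDist


def tail_ratio(threshold, mean=100.0, sd_wide=15.0, sd_narrow=14.0):
    """Ratio of the wider group's share to the narrower group's share beyond
    a threshold, assuming equal means and normal distributions.

    Thresholds at or above the mean use the upper tail, the rest the lower tail.
    All default values are hypothetical.
    """
    wide = NormalDist(mean, sd_wide)
    narrow = NormalDist(mean, sd_narrow)
    if threshold >= mean:
        return (1 - wide.cdf(threshold)) / (1 - narrow.cdf(threshold))
    return wide.cdf(threshold) / narrow.cdf(threshold)


for cutoff in (70, 130, 145):
    print("cutoff", cutoff, "-> ratio", round(tail_ratio(cutoff), 2))
```

The ratio grows as the cutoff moves further from the mean, which is why variance differences matter most for the very high and very low ends while leaving the averages identical.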

By far the most taboo topic surrounding intelligence, the ‘race-IQ controversy’ flares up every so often and leads to furious debate. One infamous instance was in 1969, when Arthur Jensen wrote a book-length paper contending that educational and intellectual differences between Black and White Americans were partly due to genetics, and that education had failed to equalize them. Another was in 1994, when Richard Herrnstein and Charles Murray published The Bell Curve, a long and detailed book that discussed the importance of intelligence for society, but also argued that different ethnic groups may see different levels of success dependent on their group’s average intelligence. The argument extends across countries: some researchers have attempted to collate the results of intelligence tests to show that people in some parts of the world (such as East Asia) are more intelligent than others (such as Europe and Africa).

Odd to see him mention such research without actually giving any citations. A citation to Lynn and Vanhanen (2012) seems appropriate here given the (mostly Lynn’s) enormous amount of labor that went into compiling these studies.

As you might expect, there is intense disagreement over this kind of research (Nisbett, 2010). Can we be so sure, ask some researchers, that any IQ differences between races really reflect intelligence differences (that is, are the tests culturally biased against some groups)? And even if we could be, how certain are we that differences are genetic, and not caused by social or economic differences (such as historical or current poverty and racism)? Genes influence intelligence, but this doesn’t necessarily mean that they influence group differences in intelligence, too. Also, if there are country-level differences in intelligence, shouldn’t we expect them to narrow over time, as poorer countries develop and experience their own Flynn Effects (see Chapter 5)? The answer to all these questions is, unfortunately, that we don’t really know. The area is so toxic and scandal-prone that most researchers (and research funders) give it a wide berth. This means that there’s far less high-quality research in this area than in the other topics we’ve discussed (see Hunt and Carlson, 2007, for a good summary of the issues).

Citing Nisbett and Hunt and Carlson as the only citations is very misleading. These people don’t actually conduct cross-national research themselves, only comment on the supposed ethical and the real scientific problems with it. Hunt later changed his mind before dying. I think cites of Rushton and Jensen (2005) and Lynn (2006) are in order, so that the reader can see both sides of the debate (a good idea is to read the 2005 special issue). The only way to find out what is true and what is false is to conduct high quality research on them.

In general, it would have been better to also include an actual brief review of country-level and subcountry-level research, which basically shows that IQ estimates have strong to very strong relationships to outcomes: e.g. national IQ x S, r=.87; authority level IQ x S in the UK, r=.56. The interpretation of such results is of course difficult, especially because they often use educational achievement scores, not IQ test scores, which are presumably more environmentally influenceable.

This is not, of course, to absolve IQ researchers of any blame: there have been many misleading arguments from a pro-IQ perspective, with some researchers – undoubtedly with their own political agendas – making grand, sweeping claims about society and history on the basis of very thin data on intelligence (one example is described by Wicherts et al., 2012). All of this adds up to a very unsatisfying debate: IQ critics often write off intelligence testing on the basis of this minority of slipshod work, leaving the vast majority of sensible scientists, and their painstakingly collected data, standing on the sidelines.

Pretty silly example. This debate is about whether Sub-Saharan African IQs are around 70 or 80 (both of which are very low by Western standards) and how to do meta-analysis (mostly about inclusion criteria). A compromise review is Rindermann’s.

7  Bias in Mental Testing (1983) by Arthur Jensen. One of the most prominent (and controversial) intelligence researchers of the twentieth century asks whether IQ tests are biased against certain groups.

1980, not 1983.