Cold winters theory: a summary of the evidence and replies to objections

Cold winters theory gets a bad name, even from fellow hereditarians. In fact, the article about it was even deleted from Wikipedia. In my opinion that’s because the critics are not familiar with the different lines of evidence that make it plausible. As such, a new brief summary of the evidence would be useful. For those interested in a historical view, see this post. There are three lines of evidence:

  • National IQs and natural correlates
  • Hunter-gatherer toolkit complexity
  • Animal ecology

National IQs and natural correlates

In the prototypical animal ecology study, we would compare the habitat of a species or subspecies with its behavioral tendencies as well as its physical measurements. We know from evolutionary theory that animals adapt to their environments, and any species of animal that spreads out into different habitats will evolve to better fit them. Cold winters theory is not special scientifically, because it is merely an instance of post out-of-Africa natural selection in humans, i.e., that as humans spread out of Africa, they encountered new environments and adapted to them accordingly. Everybody already agrees with the basic claim when it comes to phenotypes that no one cares too strongly about, such as skin color, diving ability, malaria resistance, altitude sickness adaptation and so on. Cold winters theory is the claim that cold winters specifically were important for the selection for intelligence, as cold winters are difficult to survive in.

But why that aspect of the environment and not some other one? Here we must first note the limitations of the human data. The model assumptions are:

  • Present race/ancestry groups are evolutionarily adapted to their environment, that is, there was enough time for evolution to do its work
  • They haven’t moved from their environment of evolutionary adaptedness (EEA)
  • Past environments are the same as current environments, i.e., the climate didn’t change to become warmer or more sunny

Insofar as we are using national IQ data, these assumptions are only approximate and flatly false for many countries (for other criticism of this, see Russell Warne and Noah Carl). The humans in the Americas are a recent admixture of three major populations: Europeans (the invaders), Amerindians (the invaded), and Africans (slaves imported by the invaders). Only the Amerindians had sufficient time to evolve there, since they crossed the Bering Strait some 10k years ago (ancient genomics keeps fueling debates about the specific dates). The others only had a very brief time to adapt genetically until the advent of modernity (from the 1490s to the 1850s). Because of this problem, to use data from the Americas, one would have to look at only the Amerindian groups, and they would have to be unmixed and representative of the pre-Columbian peoples. Such data do not generally exist for intelligence measures, and the representativeness assumption is dubious, though it can be tested with partial polygenic scores. The peoples that remain were presumably those that specifically sought to stay alone, and were difficult to reach to begin with, thus not representative of the original Amerindians. Because of these issues, studies of human ecology for intelligence and most other phenotypes must exclude peoples of the Americas. The same is also true for other recent settler countries such as Australia/New Zealand, Taiwan, Singapore etc. This leaves a smaller set of countries and thus less statistical certainty. With that in mind, let’s look at the paper that made the case for cold winters:

The impetus for our study was the contention of both Lynn [Lynn, R. (1991) Race differences in intelligence: A global perspective. Mankind Quarterly, 31, 255–296] and Rushton (Rushton [Rushton, J. P. (1995). Race, evolution and behavior: A life history perspective. New Brunswick, NJ: Transaction; Rushton, J. P. (1997). Race, intelligence, and the brain: The errors and omissions of the revised edition of S.J. Gould’s the mismeasurement of man. Personality and Individual Differences, 23, 169–180; Rushton, J. P. (2000). Race, evolution, and behavior. A life history perspective (3rd edition). Port Huron: Charles Darwin Research Institute] that persons in colder climates tend to have higher IQs than persons in warmer climates. We correlated mean IQ of 129 countries with per capita income, skin color, and winter and summer temperatures, conceptualizing skin color as a multigenerational reflection of climate. The highest correlations were − 0.92 (rho = − 0.91) for skin color, − 0.76 (rho = − 0.76) for mean high winter temperature, − 0.66 (rho = − 0.68) for mean low winter temperature, and 0.63 (rho = 0.74) for real gross domestic product per capita. The correlations with population of country controlled for are almost identical. Our findings provide strong support for the observation of Lynn and of Rushton that persons in colder climates tend to have higher IQs. These findings could also be viewed as congruent with, although not providing unequivocal evidence for, the contention that higher intelligence evolves in colder climates. The finding of higher IQ in Eurasians than Africans could also be viewed as congruent with the position of Diamond (1997) that knowledge and resources are transmitted more readily on the Eurasian west–east axis.

The main table:

The paper is old and used the 2002 Lynn national IQs. Among the temperature variables, the strongest correlation is with winter temperatures, not summer temperatures, with winter high being the strongest. The difference between the two winter correlations is probably not significant, but the difference from the summer temperatures is (p < .001). As such, based on this data, we can guess that the winter has a major role. The correlation with skin color, that is, skin brightness/lightness, is even stronger. The correlation is in fact absurdly high at .92. Some people therefore have thought that skin color itself might be causal for intelligence, though this idea has now been disproven a few times (see e.g. our study here). It was never very plausible to begin with, as between-race variation in skin color is caused only by a few genes, whereas intelligence involves thousands of genes. The skin color genes are not strongly related to intelligence, otherwise we would have seen them overlap in GWASs. Thus, skin color must be a proxy for something else more important, but what exactly?

One hypothesis is that skin color obtains such a strong correlation because of the very data problems we mentioned before: we know that using a country as a proxy for the environment of evolutionary adaptedness is somewhat faulty. The people who populated the countries we study didn’t exactly stay there for the last thousands of years, but moved around. And the climate is surely different now than it was 3000 years ago. As such, we know the climate correlations we see are somewhat off, likely too weak (attenuation from random error). But skin color is an adaptation to the same environment that intelligence is also hypothesized to be adapted to. Insofar as there is some evolutionary lag until a group adapts genetically to a new environment for intelligence, we would expect skin color to also show the same lag. As such, both skin color and intelligence levels should be about equally lagged, and thus show a stronger correlation than the other variables, which are less accurately matched in time and space. If the selection for skin color was primarily due to vitamin D synthesis in the dark winter months to avoid vitamin deficiency, then skin color works as a proxy for the ancestral winter environment, giving it the unusually strong correlation.
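The attenuation argument can be made concrete with the classical (Spearman) attenuation formula, in which an observed correlation equals the true correlation multiplied by the square root of the product of the two reliabilities. The sketch below is only an illustration: the reliabilities and the "true" correlation are hypothetical numbers chosen for the example, not estimates from any of the studies discussed here.

```python
import math


def attenuated_r(true_r: float, rel_x: float, rel_y: float) -> float:
    """Observed correlation expected under random measurement error.

    Classical attenuation formula: r_observed = r_true * sqrt(rel_x * rel_y),
    where rel_x and rel_y are reliabilities between 0 and 1.
    """
    return true_r * math.sqrt(rel_x * rel_y)


def disattenuated_r(observed_r: float, rel_x: float, rel_y: float) -> float:
    """Invert the formula to estimate the error-free correlation."""
    return observed_r / math.sqrt(rel_x * rel_y)


# Hypothetical values, for illustration only: current winter temperature is
# assumed to be a noisy proxy (reliability 0.70) for the ancestral climate,
# and national IQ is assumed to have reliability 0.90.
observed = attenuated_r(true_r=-0.90, rel_x=0.70, rel_y=0.90)
recovered = disattenuated_r(observed, rel_x=0.70, rel_y=0.90)
print(f"observed r = {observed:.2f}, recovered r = {recovered:.2f}")
```

Under these assumed numbers the observed correlation is noticeably weaker than the true one, which is the pattern the lag hypothesis relies on: a proxy that is better matched in time and space (here, skin color) would suffer less of this shrinkage.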

Alternatively, it could be a coincidence with these old data. It would be informative to use modern estimates of skin color instead of old anthropologists’ estimates (Biasutti 1967), as well as updated national IQs, and see whether skin color still has a correlation that’s stronger than the winter temperatures. As a matter of fact, there is a lesser known study (5 citations vs. 149 for the above) that used a newer dataset of skin color:

The primary purpose of the present research is to compare two measures of skin color. The Templer & Arikawa (2006) research reported a country-level correlation of -.92 between (darker) skin color and IQ, using a measure of skin color derived from a skin color map in the physical anthropology textbook of Biasutti (1967). Meisenberg (2004) reported a country-level correlation of .89 between IQ and skin reflectance (proportion of incoming light that is reflected from the skin, greater with lighter skin), based on skin reflectance data compiled by Jablonski & Chaplin (2000). The present study found a correlation of -.96 between the two measures of skin color, indicating very good reliability of the skin color measures. The validity of these two independent measures of skin color is supported by correlations of .88 and .84 with latitude. Both skin color measures correlated .91 with IQ. The second objective of the research was the extension of the Templer & Arikawa (2006) Old World findings to 18 regions of the New World. Darker skin color correlated -.60 with measured IQ and -.97 with IQ as predicted from Old World countries with identical skin color. These results show that the country-level correlation between

Their new table:

So, the study used a new set of estimates of skin color and a new set of estimates of national IQs, yet the results were practically identical. Skin color at the group level still has the strongest correlation with intelligence, then winter temperature, then summer temperature. There is also the addition of latitude, which shows about the same correlation as winter temperatures. Since latitude can’t really cause anything by itself, it must be a proxy for something else. But the size of the correlations suggests that it is not merely a proxy for winter temperatures, but for something else or something more. One might guess this additional factor is seasonality. Seasons are interesting because they are predictable changes in the environment that can be planned for. Surviving the winter is thus made much easier by foresight, something we know is associated with intelligence.

Hunter-gatherer toolkit complexity

Is there some other way to independently assess whether winter or latitude is associated with ancestral variation in intelligence? Yes. As Rushton put it in his Race, Evolution, and Behavior (1997):

Another set of problems in the northern latitudes would have centered on keeping warm. People had to solve the problems of making fires, clothes, and shelters. It would have been much harder to make fires in Eurasia than in Africa, where spontaneous bush fires would have been frequent. In Eurasia during the glaciations there would have been no spontaneous bush fires. People would have had to make fires by friction or percussion in a terrain where there was little wood. Probably dry grass had to be stored in caves for use as tinder and the main fuel would have been dung, animal fat, and bones. In addition, clothing and shelters were unnecessary in sub-Saharan Africa but were made in Europe during the main Wurm glaciation. Needles were manufactured from bone for sewing together animal skins, and shelters were constructed from large bones and skins. Torrence (1983) has demonstrated an association between latitude and the number and complexity of tools used by contemporary hunter-gatherers.

So the key citation goes to the obscure-sounding Torrence (1983), which is in fact a book chapter with some 800 citations:

Addresses the need for theoretical approaches to the study of prehistoric stone tools. Time stress is a major factor determining variations in technological behaviour among hunter-gatherers. Two effects of time budgeting are discussed. Predictions for the composition, diversity and complexity of tool-kits are illustrated by an analysis of tools used in the procurement of food. Although further work is needed before the ideas presented here can be implemented in the study of archaeological material, this preliminary attempt at theory building demonstrates that future research must account for the role of time in shaping prehistoric hunter-gatherer assemblages.

It sounds vague, but the main result is this:

The figure shows the (absolute) latitude of 20 hunter-gatherer groups that we have detailed toolkit data for. The author counted the number of different tools used by them. The correlation with latitude is strong, r = .69 (p < .01). But then again, citing a small and old study is maybe not the best we can do. Is there something newer and better? Yes, but first, here’s a replication analysis with better methods:
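To get a feel for how much uncertainty a sample of 20 groups leaves, one can put a confidence interval around the reported r = .69. The sketch below uses the standard Fisher z-transformation; it is a generic calculation on the reported r and n, not a reanalysis of Torrence's data, and the function name is made up for this example.

```python
import math
from statistics import NormalDist


def fisher_ci(r: float, n: int, level: float = 0.95) -> tuple[float, float]:
    """Approximate confidence interval for a Pearson correlation.

    Uses the Fisher z-transformation: atanh(r) is roughly normal with
    standard error 1 / sqrt(n - 3).
    """
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    crit = NormalDist().inv_cdf(0.5 + level / 2.0)  # about 1.96 for 95%
    return math.tanh(z - crit * se), math.tanh(z + crit * se)


# Values reported in the text: r = .69 across 20 hunter-gatherer groups.
low, high = fisher_ci(0.69, 20)
print(f"95% CI for r = .69, n = 20: [{low:.2f}, {high:.2f}]")
```

The interval comes out wide (on the order of .36 to .87), so the result is clearly positive but imprecise, which is why replications with more populations matter.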

Variation in subsistence-related material culture is an important aspect of the archaeological and ethnographic records, but the factors that are responsible for it remain unclear. Here, we examine this issue by evaluating four factors that may affect the diversity and complexity of the food-getting tools employed by hunter-gatherer populations: 1) the nature of the food resources; 2) risk of resource failure; 3) residential mobility; and 4) population size. We apply step-wise multiple regression analysis to technological and ecological data for 20 hunter-gatherer populations from several regions of the world. The analyses support the hypothesis that risk of resource failure has a significant impact on toolkit diversity and complexity. The results do not support the hypothesis that the characteristics of the resources exploited for food influence toolkit structure, or that residential mobility affects toolkit diversity and complexity. They are also not in line with the hypothesis that population size has an impact on toolkit structure. While our analyses appear to strongly support the suggestion that resource failure risk is the primary influence on hunter-gatherer toolkit structure, we argue that it would be premature to discount the other factors at this stage, and outline the steps that we believe need to be taken next.


They used the same groups. They derived three variants of the toolkit complexity (STS, TTS, AVE), and regressed these on 8 potential variables. Considering that they only had n=20 and 8 predictors, their statistical precision cannot have been good. Nevertheless, their measure of temperature had very strong validity, even reaching p = .001 despite the sample size. Clearly, there is something to be said about temperatures. For those wondering, effective temperature is a complex summary statistic of the yearly temperature, with focus on a value that most temperatures during a year do not depart too much from. Thus, yearly variation with cold spells will have a large effect. This metric was set forth by Bailey (1960):
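For readers who want to compute it, effective temperature is usually written in the hunter-gatherer literature as ET = (18W - 10C) / (W - C + 8), where W and C are the mean temperatures (in °C) of the warmest and coldest months. The sketch below assumes that form; the two example sites are hypothetical and only illustrate that a cold winter pulls the value down even when summers are identical.

```python
def effective_temperature(warmest_month_c: float, coldest_month_c: float) -> float:
    """Effective temperature (ET) in degrees Celsius.

    Assumes the commonly cited form ET = (18W - 10C) / (W - C + 8), with
    W and C the mean temperatures of the warmest and coldest months.
    """
    if warmest_month_c < coldest_month_c:
        raise ValueError("warmest month must not be colder than coldest month")
    return (18.0 * warmest_month_c - 10.0 * coldest_month_c) / (
        warmest_month_c - coldest_month_c + 8.0
    )


# Hypothetical sites with the same summer but very different winters.
mild_site = effective_temperature(warmest_month_c=20.0, coldest_month_c=15.0)
continental_site = effective_temperature(warmest_month_c=20.0, coldest_month_c=-20.0)
print(f"mild: {mild_site:.1f}, continental: {continental_site:.1f}")
```

With no seasonal variation at all (W equal to C), the formula simply returns that temperature, so the statistic departs from the plain mean only as the yearly range grows.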

This replication study still used the same 20 populations, so we may ask: Is there a study with more populations? Yes.

In the study reported here we examined the impact of population size and two proxies of risk of resource failure on the diversity and complexity of the food-getting toolkits of hunter–gatherers and small-scale food producers. We tested three hypotheses: the risk hypothesis, the population-size hypothesis, and a hypothesis derived from niche construction theory. Our analyses indicated that the toolkits of hunter–gatherers are more affected by risk than are the toolkits of food producers. They also showed that the toolkits of food producers are more affected by population size than are the toolkits of hunter–gatherers. This pattern is inconsistent with the predictions of both the risk hypothesis and the population-size hypothesis. In contrast, it is consistent with the predictions of the niche construction hypothesis. Our results indicate that niche construction has affected the evolution of technology in small-scale societies and imply that niche construction must be taken into account when seeking to understand technological variation among food producers and the technological changes that occurred in association with the various transitions to farming that have occurred over the last 10,000 years.

Their dataset:

We calculated STS and TTS for 34 hunter–gatherer populations and 45 small-scale food-producing populations (Table 1) using information from ethnographic sources varying in age from the late 1800s to the mid-twentieth century. Food producers were defined as populations that derived the majority of their food from pastoralism, horticulture, or intensive agriculture and relied on locally manufactured technology at the time fieldwork was conducted. Hunter–gatherers were defined as groups subsisting primarily on wild resources at the time of fieldwork.

And their results:

The lines do not look impressive, though they are for the hunter-gatherers. The correlations were .73/.70 for latitude and .67/.61 for effective temperature with the two measures of toolkit complexity. But there was basically no association in the food producing groups (farmers). There were also correlations with population size as would be expected. More brains can give rise to more innovation that can more easily be stored in the collective brain pool. So the findings are a bit strange! Why would temperature/latitude only predict complexity for hunter-gatherers? I can come up with some guesses. First, they didn’t do a joint regression to see if the predictors might work together in non-obvious ways (suppression effects), or maybe interact. Maybe the temperature effect on farmers was missed by not controlling for an important confounder. Second, as farmer populations are much larger and later in origin, perhaps they tend to acquire a lot of tools by copying others and trade rather than innovation. This would lead to a diffusion of such tools from more innovative to less innovative peoples. Overall, the hunter-gatherer data is suggestive but farmer data doesn’t fit, despite the modern farmer data fitting. Are the data collected at the same time? Someone will have to look more closely at these datasets.

Animal ecology

Forget about humans for a moment and consider the same issue for animals. Animals that live in habitats with cold winters must also adapt to this and survive somehow. There are a few strategies that come to mind: 1) they can leave, i.e. migratory species, 2) they can hibernate, i.e., enter a low energy-use state and wait it out if they were fat enough to last through the winter (but see this study), 3) they can become behaviorally flexible enough that they can still find sufficient food in the winter. Clearly, cold winters theory predicts that migration is not a particularly intelligence-selecting strategy, and probably a negative one, as the brain weighs you down while flying. Hibernation does require foresight, so it should result in some positive selection. But finding food in the difficult time of year must be yet more difficult, so it should produce the strongest selection. As it happens, birds are useful for this kind of contrast: some of them leave and some of them stay. The model thus predicts that latitude/winter temperature should predict intelligence in birds that stay but not in those that leave. And we do have a study:

Brain size relative to body size is smaller in migratory than in nonmigratory birds. Two mutually nonexclusive hypotheses had been proposed to explain this association. On the one hand, the “energetic trade-off hypothesis” claims that migratory species were selected to have smaller brains because of the interplay between neural tissue volume and migratory flight. On the other hand, the “behavioral flexibility hypothesis” argues that resident species are selected to have higher cognitive capacities, and therefore larger brains, to enable survival in harsh winters, or to deal with environmental seasonality. Here, I test the validity and setting of these two hypotheses using 1466 globally distributed bird species. First, I show that the negative association between migration distance and relative brain size is very robust across species and phylogeny. Second, I provide strong support for the energetic trade-off hypothesis, by showing the validity of the trade-off among long-distance migratory species alone. Third, using resident and short-distance migratory species, I demonstrate that environmental harshness is associated with enlarged relative brain size, therefore arguably better cognition. My study provides the strongest comparative support to date for both the energetic trade-off and the behavioral flexibility hypotheses, and highlights that both mechanisms contribute to brain size evolution, but on different ends of the migratory spectrum.

Birds that migrate further away have smaller brains relative to their bodies (by weight). Relative brain size is a commonly used proxy for intelligence in animals, as we don’t really have a lot of cross-species intelligence data from birds (there is some!). As making brains bigger is not the only way to make them better, this is only a proxy. In humans, brain size (volume) correlates about .30 with intelligence, so there is a lot of other brain variation left. But between species and subspecies, it is probably reasonably accurate.

Winter (non-breeding) temperatures predict relative brain size in the right direction. The author summarizes his findings like this:

Nonbreeding minimum temperature has a strong effect on brain size in both fully resident and short-distance migratory species (Table 2, Fig. 3); the lower the nonbreeding minimum temperature, the larger the brain size (Table 2, Fig. 3). Indeed, the effect of nonbreeding minimum temperature was comparable across different migratory intervals between 0 and 500 km, but not above 500 km (Table 2). In several species subsets, nonbreeding minimum temperature is the only significant predictor of relative brain size, while seasonality and nonbreeding latitude have little predictive power. Where significant, brain size increases with seasonality and increases with increasing nonbreeding latitude (Table 2); all results were highly consistent when repeated using just passerines (Table 2)

The analysis is not entirely satisfactory. The use of subset regression is not the optimal approach, interaction terms would have been better. The models were also not reported in full, so it is difficult to see exactly what is going on. Nevertheless, we do see roughly the expected patterns. As before, this analysis has to be redone.

This is not the only such study. Back in 2017, I compiled a bunch of studies of animal ecology that examined (relative) brain size in relation to latitude or temperatures. The literature is not dispositive but tends to show that colder temperatures and distance from the equator predict relative brain size in multiple species, aside from humans. I’m sure there are more studies by now, so one will have to redo this literature review.


Recall, no one claims that winter temperature or latitude is the only factor explaining variation between species or subspecies, just that it is an important one, and for humans, probably the most important. One can easily look at a scatterplot between national IQs for our ‘stable populations’ (those that did not move so much) and point to glaring outliers. Take China for instance. By all accounts, the Han Chinese are smart at about 105 IQ (Han is the largest ethnic group). However, looking at the map of temperatures above, their environment is warmer than Europeans’, especially in the south of China, yet the Chinese have higher intelligence, even the southern ones (Hong Kong is in the south). How can that be? There are a number of possible answers. First, perhaps the Han Chinese evolved higher intelligence in the Beijing area and migrated southwards in recent history, and so retained their selection for intelligence from a colder climate. Second, perhaps the climate used to be different (colder) and the Han Chinese are still adapted to this climate. Third, perhaps some other factor had a relatively stronger role here, such as internal eugenic selection for social reasons. As we showed in our Mormon study, it is possible for different religious and political groups to show different strengths of selection against intelligence, so it stands to reason that in the past, some cultures selected for it, and others against it. The Chinese famously used civil servant tests, which do point to a culture with a strong academic bent. This then has to be added to the existing selection pressure from the climate. Fourth, there can be other less powerful effects of the environment that were in effect. For instance, consider this map of the biomes of the earth (Lomolino 2020):

Here we see that most of China actually has the same biome as most of Europe: temperate deciduous forest/subtropical evergreen forest. The problem with adding more and more such potential causes is that we only have some ~130 countries standing in for ethnic groups to work with. One could — should — try adding them all together in Bayesian meta-analysis and see what comes out ahead. For more work on biomes and human intelligence, see Figueredo et al 2020.

The Arctic populations represent another cluster of outliers. It is certainly colder in the Arctic than in Europe or in Northeast Asia. But all our data with Arctic peoples show that their intelligence is not higher, but somewhat lower (about 90) than the temperate peoples (about 100). Why is that? We don’t really know of course, but the most plausible factor is population size. Population density is a function both of the intelligence of the people inhabiting the lands, but also of the inherent energy supply of the environment. Before modern technology, it simply wasn’t possible to extract energy efficiently from the Arctic environment to sustain a large population size. Farming is impossible, so only hunter-gathering is possible. As population size shrinks, so too does the chance for new mutations to arise and thus spread. Insofar as new mutations were important in the evolution of human intelligence, this would then slow down selection for intelligence, though not change the optimal value. It would take a longer time to reach it. Second, adaptability to a given environment inherently depends on the availability of useful resources, including other animals. There is a well known latitude gradient with biodiversity (Mannion et al 2014):

Deserts and arctic regions provide almost nothing to work with, so in a sense, it doesn’t matter how smart you are if there is nothing to exploit. This would perhaps reduce the selection pressure for intelligence in such regions. The Kalahari desert people in Africa have particularly low intelligence and that seems also true for the Aborigines in Australia; both are desert peoples. Note again here that China has a higher biodiversity than Europe.

Historical considerations are another objection. Currently, it is true, the smartest people live in temperate climates. But historically, it seems the most clever people lived in warmer climates, as judged by their ability to invent agriculture (Mesopotamia, India etc.) and, well, Western civilization (in Greece and Rome, not in Scandinavia!). Just 2000 years ago, southern Europe was clearly more impressive than northern Europe. Doesn’t this show that the latitude and temperature association we currently see with intelligence is perhaps more of a recent phenomenon, perhaps a coincidence? Not exactly. The problem is using civilization as a measure of the (genetic) intelligence of peoples. The association between the two is clearly very strong now, but it wasn’t necessarily so 2000 or 5000 years ago. To create a civilization, one must first invent agriculture, so that population size can increase from farming. Farming is more difficult in colder places with shorter growing seasons and less sunlight for plants. Agriculture would first be implemented in the places where it is the easiest to get started, even if the people living there weren’t the brightest. As technology improves and diffuses, it would then later become possible to implement agriculture in increasingly cold places. These two causes of agriculture, intelligence and the inherent ease of doing it in that climate, could explain why it originated first among peoples who weren’t the smartest on Earth, but weren’t the dullest either.

To repeat, no one thinks that cold winters are the only selective force for intelligence. The dominance of the Greek, Roman, Persian, Inca etc. empires could also have been due to them evolving particularly high intelligence for other reasons, e.g. eugenic cultures or biomes, only to later lose it after the selection pressure was reversed, e.g. by the introduction of Christianity or Islam. This is the basis of the genetic cyclical theories of the rise and fall of civilizations. Based on this, it is expected that the polygenic scores for intelligence were indeed higher in the Roman empire compared to before and after, and in Greece at its height. It remains to be seen whether ancient genomics will bear out this prediction (but stay tuned!).

The existence of merchant peoples like the Jews and Parsis, and maybe the Igbos, shows that there is more than one evolutionary path to high intelligence. These groups evolved high intelligence without living in the cold, and conspicuously higher than their neighbors in the same climate. The relative rarity of such groups suggests, however, that climate-related factors were the most important overall.

But can it really be tested? Above, I set out a bunch of hypotheses for why the climate does not explain all variation in human intelligence. But of course, this has a very ad hoc nature. Is there a way to do a strong test? An experiment? Yes, though it would be expensive. Instead of relying on natural variation in climate between species and subspecies and trying to deal with alternative explanations of the data (confounders), one could artificially modify the environments at random for populations of an existing species. Take mice for instance. I don’t know of any study that looked into the global variation in mouse brain sizes in comparison to their habitats. However, we could gather a population of such wild mice and split them up at random into different hangars. Inside each hangar we would construct a natural-like environment with different food sources, predators, seasons etc. Over time, we would then change the climate of the hangars in different directions: some towards equatorial, some towards temperate. Doing this we would also introduce seasonality and rain/snow fall. Cold winters theory predicts that the mice so exposed to increasingly cold winter environments would, among other things, be selected for intelligence. We would collect all the mice for genetic testing and measurements of brain size. Such a study could in theory be done, though it would obviously be expensive to run the hangars. It should be said that such long-term selection experiments are not unheard of. They have been done, and are being done, for several species of livestock, other animal species and for some plants. In those experiments, though, humans are artificially doing the selection, for instance, for higher or lower oil content in plants, and even brain size in fish. With the increasing prosperity of humanity, it is possible that we could be running some of these climate experiments. It is certainly within our current budgets too. Consider the amount of money spent on testing fundamental theories in physics. The Large Hadron Collider (a ring about 27 km in circumference) cost several billion euros to build.
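The random-assignment step of the proposed mouse experiment is the part that removes confounding, so it is worth spelling out. The following is a minimal sketch, assuming hypothetical mouse IDs and two hypothetical climate conditions; a real design would also need replicate hangars per condition and many generations of follow-up.

```python
import random


def assign_to_hangars(mouse_ids, conditions, seed=0):
    """Randomly split founders into equally sized groups, one per condition.

    Shuffling first and then dealing round-robin means group membership
    cannot depend on any trait of the mice.
    """
    rng = random.Random(seed)  # fixed seed so the allocation is reproducible
    shuffled = list(mouse_ids)
    rng.shuffle(shuffled)
    assignment = {condition: [] for condition in conditions}
    for index, mouse in enumerate(shuffled):
        assignment[conditions[index % len(conditions)]].append(mouse)
    return assignment


# Hypothetical founder population and climate treatments.
founders = [f"mouse_{i:03d}" for i in range(200)]
treatments = ["equatorial", "temperate_cold_winter"]
hangars = assign_to_hangars(founders, treatments, seed=42)
for condition, mice in hangars.items():
    print(condition, len(mice))
```

Because the founders are allocated by chance alone, any later divergence in brain size or polygenic scores between the conditions can be attributed to the imposed climates rather than to pre-existing differences between the groups.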


  • Cold winter theory enjoys three relatively independent lines of support
  • The various objections to it are not fatal, once one keeps in mind that no one claims that climatic factors are the only explanation for population variation in intelligence
  • Aspects of the theory can be tested by better analyses of existing data (human and non-human) and future ancient genomic data
  • Cold winter theory is testable in theory, and even in practice if some crazed billionaire decided to do it
