What exactly is age heaping and what use is it?

What is age heaping?

Number heaping is a common human tendency: we tend to round numbers to the nearest 5 or 10 (those of us who use the decimal system!). Age heaping is the tendency of innumerate people to report their age rounded to the nearest 5 or 10, presumably because they cannot subtract their birth year from the current year to work out their current age. Psychometrically speaking, this is a very easy mathematical test, so why is it useful? Surely everybody but small children can do it now? Yes. However, in the past, not all adults could do this, even in Western countries. One can locate legal documents and tombstones from these times and analyze the amount of age heaping. The figure below shows an example of age heaping in old Italian data.

Source: “Uniting Souls” and Numeracy Skills. Age Heaping in the First Italian National Censuses, 1861-1881. A’Hearn, Delfino & Nuvolari – Valencia, 13/06/2013.

Since we know that the true age distribution is smooth, that is, the number of people aged 60 should be about the same as the number aged 59 or 61, we can calculate indexes of how much heaping there is and use them as a crude numeracy measure. Economic historians have been doing this for some time, so we have some fairly comprehensive datasets for age heaping by now.
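To make the idea of a heaping index concrete, here is a minimal sketch of two measures commonly used in this literature: the Whipple index (the share of reported ages ending in 0 or 5 among ages 23 to 62, relative to the one fifth expected without heaping, times 100) and the ABCC rescaling of it, where 100 means no heaping. The function names and the toy age vectors are my own illustration, and I am only assuming that the age heaping variables used in this post are on an ABCC-like scale.

```r
# Whipple index: 100 = no heaping, 500 = every reported age is a multiple of 5.
# The 23-62 age window is the conventional choice; function name is hypothetical.
whipple_index <- function(ages, lo = 23, hi = 62) {
  ages <- ages[!is.na(ages) & ages >= lo & ages <= hi]
  100 * sum(ages %% 5 == 0) / (length(ages) / 5)
}

# ABCC index: rescales Whipple so that 100 = no heaping and 0 = complete heaping.
# Whipple values below 100 are treated as no heaping.
abcc_index <- function(w) {
  ifelse(w >= 100, (1 - (w - 100) / 400) * 100, 100)
}

# Toy data (made up for illustration, not from any census)
no_heaping   <- 23:62                              # one person at every age
full_heaping <- rep(c(30, 40, 50), times = 10)     # everyone reports a round age
some_heaping <- c(23:62, rep(seq(25, 60, by = 5), times = 5))

w <- c(none = whipple_index(no_heaping),
       some = whipple_index(some_heaping),
       full = whipple_index(full_heaping))
print(w)
print(abcc_index(w))
```

With real data, `ages` would be the vector of self-reported ages from a census or similar source, ideally split by birth cohort so that numeracy can be tracked over time.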

Is it a useful correlate?

If you read the source above, you will see that age heaping in the 1800s shows the expected north/south pattern within Italy, but this is just one case. Does it work in general? The answer is yes. Below I plot some of the age heaping datasets versus Lynn and Vanhanen’s (2012) national IQs:

The problem with the data is this: the older datasets cover fewer countries and the newer datasets show strong ceiling effects (lots of countries very close to 100 on the x-axis). The ceiling effects arise because the test is too easy. Still, the data cover a sufficiently large number of countries to be useful for modern comparisons. For instance, we can predict the performance of immigrant groups in Scandinavian countries from the numeracy of their countries of origin in the 1800s. Below I plot general socioeconomic performance (a general factor of education, income, use of social benefits and crime in Denmark in 2012) against age heaping in 1890:

The actual correlations are shown below:

	AH1800	AH1820	AH1850	AH1870	AH1890	LV12 IQ	S in DK
AH1800	1	0.95	0.94	0.96	0.9	0.85	0.61
AH1820	0.95	1	0.94	0.94	0.76	0.62	0.67
AH1850	0.94	0.94	1	0.99	0.84	0.73	0.59
AH1870	0.96	0.94	0.99	1	0.96	0.64	0.56
AH1890	0.9	0.76	0.84	0.96	1	0.52	0.73
LV12 IQ	0.85	0.62	0.73	0.64	0.52	1	0.54
S in DK	0.61	0.67	0.59	0.56	0.73	0.54	1

And the sample sizes:

	AH1800	AH1820	AH1850	AH1870	AH1890	LV12 IQ	S in DK
AH1800	31	25	22	22	24	29	24
AH1820	25	45	37	22	36	43	27
AH1850	22	37	45	27	37	43	30
AH1870	22	22	27	62	56	61	34
AH1890	24	36	37	56	109	107	50
LV12 IQ	29	43	43	61	107	203	68
S in DK	24	27	30	34	50	68	70
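The sample sizes vary a lot from cell to cell, presumably because each correlation is computed on just the countries that have data for both variables. As a rough illustration of that idea, here is a base-R sketch on a small made-up data frame; it is not the code used for the tables above (that is in the R code section), and the column names `a`, `b` and `c` are placeholders.

```r
# Toy data with scattered missing values (made up for illustration)
toy <- data.frame(
  a = c(1, 2, 3, 4, NA, 6),
  b = c(2, 1, 4, NA, 5, 7),
  c = c(NA, 3, 2, 5, 4, 6)
)

# Pairwise correlations: each cell uses only the rows where both columns are present
cors <- cor(toy, use = "pairwise.complete.obs")

# Pairwise sample sizes: count the rows where both columns are non-missing
present <- !is.na(as.matrix(toy))
storage.mode(present) <- "numeric"   # TRUE/FALSE -> 1/0 so the matrix product counts rows
n_pairs <- crossprod(present)        # same as t(present) %*% present

print(round(cors, 2))
print(n_pairs)
```

The diagonal of `n_pairs` gives the number of non-missing cases per variable, and the off-diagonal cells give the number of cases behind each correlation, which is how the two tables above should be read together.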

Great, where can I find the datasets?

Fortunately, they are freely available. The easiest solution is probably just to download the worldwide megadataset, which contains a number of the age heaping variables and lots of other variables for you to play around with: https://osf.io/zdcbq/files/

Alternatively, you can find Baten’s age heaping data directly: https://www.clio-infra.eu/datasets/indicators

R code

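# Assumes the megadataset is loaded as DF.supermega, with country codes as row names.
# wtd.cors() is from weights; count.pairwise() is from psych (pairwiseCount() in recent versions).
# write_clipboard() appears to come from the author's own kirkegaard package.
library(ggplot2)
library(stringr)
library(weights)
library(psych)

ah_vars = c("AH1800", "AH1820", "AH1850", "AH1870", "AH1890")
temp = subset(DF.supermega, select = c(ah_vars, "LV2012estimatedIQ", "S.factor.in.Denmark.Kirkegaard2014"))

# Correlation matrix and pairwise sample sizes, copied to the clipboard
write_clipboard(wtd.cors(temp), digits = 2)
write_clipboard(count.pairwise(temp))

# One scatterplot per age heaping year against national IQ
for (year in ah_vars) {
  p = ggplot(DF.supermega, aes(.data[[year]], LV2012estimatedIQ)) + geom_point() + geom_smooth(method = lm) + geom_text(aes(label = rownames(DF.supermega)))
  ggsave(str_c(year, "_IQ.png"), plot = p)
}

# Socioeconomic performance in Denmark against age heaping in 1890
p = ggplot(DF.supermega, aes(AH1890, S.factor.in.Denmark.Kirkegaard2014)) + geom_point() + geom_smooth(method = lm) + geom_text(aes(label = rownames(DF.supermega)))
ggsave("AH_S_DK.png", plot = p)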
