There are many papers on the relationship between national intelligence and national scholastic scores, and on the degree to which the latter indexes the former. Some typical papers:
- Rindermann, H. (2007). The g‐factor of international cognitive ability comparisons: The homogeneity of results in PISA, TIMSS, PIRLS and IQ‐tests across nations. European Journal of Personality, 21(5), 667-706.
- Lynn, R., & Mikk, J. (2009). National IQs predict educational attainment in math, reading and science across 56 nations. Intelligence, 37(3), 305-310.
- Meisenberg, G., & Woodley, M. A. (2013). Are cognitive differences between countries diminishing? Evidence from TIMSS and PISA. Intelligence, 41(6), 808-816.
- Jones, G., & Potrafke, N. (2014). Human capital and national institutional quality: Are TIMSS, PISA, and national average IQ robust predictors? Intelligence, 46, 148-155.
Because of papers like these, of which there are hundreds, it is actually quite difficult to find studies that relate scholastic tests to intelligence at the individual level, since there is no simple way to exclude the national-level results from the search results (as far as I can work out!). However, John Fuerst was able to find two:
- Saß, S., Kampa, N., & Köller, O. (2017). The interplay of g and mathematical abilities in large-scale assessments across grades. Intelligence, 63, 33-44.
- Flores-Mendoza, C., Ardila, R., Rosas, R., Lucio, M. E., Gallegos, M., & Colareta, N. R. (2018). Intelligence Measurement and School Performance in Latin America.
The first reports several latent correlations, around β = .70. The second reports correlations between Raven’s SPM and PISA of about r = .56. The first estimate is already adjusted for measurement error because it is SEM-based; however, the intelligence tests were fairly short, so there is some construct invalidity (i.e., they do not measure a broad enough g). In the second case, one has to adjust for both measurement error and construct invalidity. In both cases, making these adjustments would probably raise the estimated true correlation to about .80. This is the typical value seen for other achievement tests and intelligence batteries in SEM. E.g.:
- Deary, I. J., Strand, S., Smith, P., & Fernandes, C. (2007). Intelligence and educational achievement. Intelligence, 35(1), 13-21.
- Kaufman, S. B., Reynolds, M. R., Liu, X., Kaufman, A. S., & McGrew, K. S. (2012). Are cognitive g and academic achievement g one and the same g? An exploration on the Woodcock–Johnson and Kaufman tests. Intelligence, 40(2), 123-138.
- Zaboski II, B. A., Kranzler, J. H., & Gage, N. A. (2018). Meta-analysis of the relationship between academic achievement and broad abilities of the Cattell-Horn-Carroll theory. Journal of School Psychology, 71, 42-56.
These studies generally find r = .70 to .80, with values toward the upper end when stronger methods are used to estimate the latent relationship rather than the observed one.
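The two-step adjustment mentioned above (first for measurement error, then for construct invalidity) can be sketched with Spearman's classic correction for attenuation. The reliabilities and the SPM's g-loading below are assumed values for illustration, not figures from the papers:

```python
import math

def disattenuate(r_obs, rel_x, rel_y):
    """Spearman's correction for attenuation:
    r_true = r_obs / sqrt(rel_x * rel_y)."""
    return r_obs / math.sqrt(rel_x * rel_y)

# Observed SPM x PISA correlation reported by Flores-Mendoza et al. (2018)
r_obs = 0.56

# Assumed test reliabilities (illustrative, not from the paper)
rel_spm, rel_pisa = 0.85, 0.90

# Step 1: correct for measurement error
r_reliable = disattenuate(r_obs, rel_spm, rel_pisa)  # ~0.64

# Step 2: crude correction for construct invalidity, dividing by an
# assumed g-loading of the SPM (~.80, illustrative)
r_true = r_reliable / 0.80  # ~0.80

print(round(r_reliable, 2), round(r_true, 2))
```

Under these assumed inputs the corrected value lands near .80, consistent with the SEM-based estimates from broader batteries cited above.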
As a die-hard Jensenist, I will of course note that Jensen wrote back in 1969:
> The Stanford-Binet and similar intelligence tests predict various measures of scholastic achievement with an average validity coefficient of about .5 to .6, and in longitudinal data comprising intelligence test and achievement measures on the same children over a number of years, the multiple correlation between intelligence and scholastic achievement is almost as high as the reliability of the measures will permit.
For the reader who wants a lot more, there is a 100+ page book chapter on it from 1993:
- Jensen, A. R. (1993). “Psychometric g and achievement”. In B. R. Gifford (Ed.), Policy perspectives on educational testing. Norwell, MA: Kluwer Academic Publishers. Pp. 117-227.
Honorable mentions
Longitudinal studies:
- Barth, E., Keute, A. L., Schøne, P., von Simson, K., & Steffensen, K. (2019). NEET Status and Early Versus Later Skills Among Young Adults: Evidence From Linked Register-PIAAC Data. Scandinavian Journal of Educational Research, 1-13.