Standard deviation of total SAT scores: not simply the sum of the standard deviations of subtests

Someone on the SlateStarCodex subreddit wrote:

AFAIK, SATs are normed to mean of 500 per section and standard deviation of 100. Assuming that math and verbal are highly correlated, that approximates to 1000 mean and 200 standard deviation for the SAT 1600 scale. The median reported score was 1490, and even the 10th percentile was 1320. Which means that half of SSC’ers are in the top percentile of intelligence, and 90% are in the top 5% of US population. This is about in line with reported IQ, where median was 137 (equivalent to 1493 SAT score using 100/15 IQ scale) and 10th percentile 124 (equivalent to 1320 SAT score).

Then someone else:

Standard deviations don’t add like that when you combine distributions, do they? I thought there’d be a square-sum-root step in there somewhere.

The second commenter is correct. In fact, the assumption that the first commenter mentions goes the exact opposite way.

—

No, they don’t. Variance is simply additive, but only when the variables are uncorrelated. When they are positively correlated, the variance of the sum grows faster (‘super-additive’).

https://en.wikipedia.org/wiki/Variance#Basic_properties
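
For reference, here is the identity from that page written out for the math (M) and verbal (V) subtests. The symbols r (the correlation) and σ (a standard deviation) are notation introduced here for illustration, not taken from the original discussion.

```latex
\begin{aligned}
\operatorname{Var}(M + V) &= \operatorname{Var}(M) + \operatorname{Var}(V) + 2\operatorname{Cov}(M, V), \\
\operatorname{Cov}(M, V) &= r\,\sigma_M\,\sigma_V, \\
\sigma_{M+V} &= \sqrt{\sigma_M^2 + \sigma_V^2 + 2\,r\,\sigma_M\,\sigma_V}.
\end{aligned}
```

With r = 0 the last line reduces to the square root of the sum of squares, and with r = 1 it reduces to the plain sum of the two standard deviations.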

SAT subtests (M, V) presumably correlate at .73 or so, which gives a covariance of .73 * 100 * 100 = 7300. So the standard deviation of the combined metric is about sqrt(10000 + 10000 + 2*7300) = 186. To make sure we did it right, let’s simulate some SAT-like data.
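
Before simulating, a quick sketch of how the total's standard deviation depends on the assumed correlation. The helper name sd_sum and the three correlation values are chosen here for illustration; only the .73 figure comes from the text above.

```r
# SD of the sum of two scores, given each score's SD and their correlation r.
# sd_sum is a hypothetical helper name used only for this illustration.
sd_sum <- function(sd_x, sd_y, r) {
  sqrt(sd_x^2 + sd_y^2 + 2 * r * sd_x * sd_y)
}

# Uncorrelated, the presumed SAT correlation, and perfectly correlated.
r_values <- c(0, 0.73, 1)
sds <- sd_sum(100, 100, r_values)
names(sds) <- paste0("r=", r_values)
print(round(sds, 1))
```

The total's standard deviation comes out at roughly 141 for uncorrelated subtests, 186 at r = .73, and exactly 200 only when the subtests are perfectly correlated.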

> library(tidyverse)
> library(magrittr) # for set_colnames(), which tidyverse does not attach
> sat = MASS::mvrnorm(n = 1e6, mu = c(500, 500), Sigma = matrix(c(10000, 7300, 7300, 10000), nrow = 2), empirical = TRUE) %>%
+   as.data.frame() %>%
+   set_colnames(c("M", "V")) %>%
+   mutate(total = M + V)
> cor(sat)
         M    V total
M     1.00 0.73  0.93
V     0.73 1.00  0.93
total 0.93 0.93  1.00
> psych::describe(sat)
      vars     n mean  sd median trimmed mad   min  max range skew kurtosis   se
M        1 1e+06  500 100    500     500 100   6.5  989   982    0     0.00 0.10
V        2 1e+06  500 100    500     500 100  -1.9  963   965    0    -0.01 0.10
total    3 1e+06 1000 186   1000    1000 186 118.9 1802  1684    0     0.00 0.19

Math checks out.
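
To see what the smaller standard deviation does to the scores quoted at the top, here is a minimal sketch comparing z-scores and percentiles under the two values. It assumes total scores are normally distributed with mean 1000, as in the simulation; the variable names are made up for this example.

```r
# Compare z-scores and normal percentiles of total scores under the two SDs.
# Assumption: totals are normal with mean 1000, as in the simulated data.
total_mean <- 1000
sd_naive   <- 200                               # sum of the subtest SDs
sd_correct <- sqrt(100^2 + 100^2 + 2 * 7300)    # about 186

scores <- c(1320, 1490)  # 10th percentile and median from the quoted comment
comparison <- data.frame(
  score     = scores,
  z_naive   = (scores - total_mean) / sd_naive,
  z_correct = (scores - total_mean) / sd_correct
)
comparison$pct_naive   <- 100 * pnorm(comparison$z_naive)
comparison$pct_correct <- 100 * pnorm(comparison$z_correct)
print(round(comparison, 2))
```

Under these assumptions a 1490 sits at about z = 2.63 rather than z = 2.45, so the same raw scores are somewhat more extreme than the 200-point figure suggests.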

See also an interesting case study where this matters: The Composite Score Extremity Effect.
