Gordon (1997), “Everyday life as an intelligence test: Effects of intelligence and intelligence context” (Thanks to this guy for making me aware of the paper)

“Take, for example, complaints concerning a Year IV-6 Stanford-Binet item, which shows three pairs of sharply contrasting drawings of faces and asks, “Which one is prettier ?” The item has been held up to ridicule as an example of “aesthetic comparison,” which is what it is unfortunately called (Terman & Merrill, 1960, p. 79), rather than of choice between responses that are cognitive and so capable of being judged better or worse unequivocally (Jensen, 1980a, p. 5). Standards of beauty can vary from culture to culture, it was pointed out, and so right or wrong cannot be settled objectively, unlike such responses as “taller” or “shorter.” To such critics, the item lacked, no pun, face validity.”

I lol’d.

“The key intellectual problem, as one judge saw it, was that the jurors “were remarkably poor evaluators of the facts” and had consistently confused logical possibilities with grounds for reasonable doubt by neglecting to give sensible consideration to probabilities (Rothwax, 1996, pp. 226, 229). A law professor cast the problem more broadly: “Modern trials hinge on complicated assessments of economic models and DNA evidence . . . Lay juries are simply not equipped to perform this assessment” (Dow, 1995, p. A32). Coincidentally, a national survey of adults in 1995 had found that “only one in five Americans [21%] can provide a minimally acceptable definition of DNA” (National Science Board, 1996, pp. 7-8, Appendix Table 7-7). During expert but often dull DNA testimony, the jurors were visibly bored (Shapiro, 1996, p. 353; Toobin, 1996, p. 345). DNA was mentioned 10,000 times (“Simpson Trial & Trivia,” 1995). When presented with the statement, “The Simpson jury just wasn’t smart enough to understand the evidence in the case,” 26% of Whites, but also 10% of Blacks, agreed (Morin, 1995, p. A34). If it is reasonable to regard the task of jury service as a kind of job, validity generalization theory would indicate that performance of that job must be related to g, most especially in complex trials (e.g., Gottfredson, 1986b).”

For a lot more similarly depressing survey data, see this book.

The paper is really interesting. The author uses a population-IQ-model that is so simple that it is very surprising that it works so well to predict stuff. Here is an illustration from the paper.

(The number given for IQ=115 is wrong. 115 is 2sd from 85, and so the number shud be 97.72%)

The idea is that there is a critical IQ for something. The simple interpretation of this is that above that IQ, no one makes the mistake in question (or does the thing in question, if we want to be more value-free in our description) and below it everybody makes it every time. This is very similar to the model i used to predict the average IQs of Danish students given that some set percentage has to come thru the education. An alternative and more realistic interpretation is that at the critical IQ, the probability of making the mistake is .5, so that if the IQ is higher than the critical IQ, then pr<.5, and if it is lower, then pr>.5. If the probability of making the mistake falls off symmetrically around the critical IQ (normally distributed, or just not skewed), about the same results shud result, as long as that spread is small compared to the population SD.
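To make the two interpretations concrete, here is a minimal sketch in Python. It assumes IQ is normally distributed with SD 15 and uses the mean of 85 from the correction above. The function names and the `spread` parameter are my own for illustration and are not taken from the paper.

```python
from math import erf, sqrt


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def rate_below(critical_iq, mean, sd=15.0):
    """Hard threshold: everyone below the critical IQ makes the mistake,
    no one above it does, so the rate is the share of the population
    that falls below the critical IQ."""
    return normal_cdf((critical_iq - mean) / sd)


def soft_rate(critical_iq, mean, sd=15.0, spread=0.0):
    """Soft threshold (illustrative assumption): a person with IQ x makes
    the mistake with probability Phi((critical_iq - x) / spread), which is
    exactly .5 at the critical IQ. Averaged over a normal population this
    gives Phi((critical_iq - mean) / sqrt(sd^2 + spread^2))."""
    return normal_cdf((critical_iq - mean) / sqrt(sd ** 2 + spread ** 2))


# The corrected figure: 115 is 2 SD above a mean of 85.
print(round(100 * rate_below(115, 85), 2))  # 97.72

# The soft version is pulled toward 50% as the spread grows.
for spread in (0.0, 5.0, 10.0):
    print(spread, round(100 * soft_rate(115, 85, spread=spread), 2))
```

With `spread=0` the soft version reduces to the hard threshold. It also agrees exactly when the population mean sits at the critical IQ. In every other case the predicted rate moves somewhat toward 50%, and the shift stays small while the spread is small relative to the population SD.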

This is pretty cool in itself, but the cooler thing is that the model successfully predicts (or postdicts in this case) the rates for all kinds of things such as HIV-infection rates, crime rates, poverty rates, single-motherhood (but to a lesser degree, see the paper for discussion), and opinion polls about various things such as belief that O. J. Simpson was not guilty and belief in black conspiracy theories (i had never heard of these before reading the paper).

Reading the paper, i really want to extend this to more things. Coincidentally, i have long thought about the gender difference in belief in superstitious stuff like religion. In Denmark, there is a huge 20 point gender difference (~50% vs. ~70% for men and women, respectively). I also know that there is a small g difference between men and women. I wanted to see if the population-IQ-model can predict the data. It seems that it cannot fully account for it. Using a 5 IQ gender difference, and a 15% increased variance in male IQ, i tried (trial and error) to see if i cud find a critical IQ that wud explain the data. The closest match seems to be around 104, which yields the predictions: 53.7% religiousness in males, and 67.9% in females. That is a predicted gap of about 14 points against the observed 20: rather close to the mark, but there still seems to be another factor. To make it worse, the gender difference is probably not a whole 5 points, but more likely 3-4 points, which makes the unexplained discrepancy between the model and the data even larger.
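The calculation can be sketched in a few lines of Python. The exact means and SDs used are not stated above, so the parameters below are an assumption: a female mean of 97.5 with SD 14, and a male mean of 102.5 with an SD 15% larger (16.1). That choice is one parameterization that reproduces the quoted 53.7% and 67.9% at a critical IQ of 104. Note that it applies the 15% to the SD rather than to the variance. The grid search is a stand-in for the trial and error, and all names are illustrative.

```python
from math import erf, sqrt


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def share_below(critical_iq, mean, sd):
    """Share of a normal population falling below the critical IQ."""
    return normal_cdf((critical_iq - mean) / sd)


# ASSUMED parameters (not stated in the text): 5 point mean difference,
# male SD 15% larger than the female SD.
FEMALE = (97.5, 14.0)
MALE = (102.5, 14.0 * 1.15)


def predict(critical_iq):
    """Predicted religiousness rates (male, female) for a critical IQ."""
    return share_below(critical_iq, *MALE), share_below(critical_iq, *FEMALE)


def best_critical_iq(target_male=0.50, target_female=0.70):
    """Coarse grid search over 80..120 in half-point steps, minimizing the
    squared error against the observed rates."""
    def error(c):
        male, female = predict(c)
        return (male - target_male) ** 2 + (female - target_female) ** 2

    candidates = [c / 2 for c in range(160, 241)]
    return min(candidates, key=error)


male, female = predict(104)
print(f"males {100 * male:.1f}%, females {100 * female:.1f}%")
print(f"predicted gap {100 * (female - male):.1f} points")
print("best critical IQ on the grid:", best_critical_iq())
```

Under these assumptions the search lands within a point of 104, and the predicted gap stays near 14 points, so no single critical IQ gets both groups right at once.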

Some papers about gender differences in g/IQ

Helmuth Nyborg – Sex-related differences in general intelligence g, brain size, and social status

Males have greater g: Sex differences in general mental ability from 100,000 17- to 18-year-olds on the Scholastic Assessment Test

R. Lynn and P. Irwing – Sex differences on the Progressive Matrices: a meta-analysis

The first study does find rather large gender differences, both in g and in variance. Afaict, the second study does not report a difference in variance but cites some other studies that did. The last study also found an IQ gender difference but didn’t seem to report any variance difference.

