The g factor, the science of mental ability – Arthur R. Jensen, ebook download pdf free
This is a very interesting book. Without a doubt the best about intelligence that i hav read so far. I definitely recommend reading it if one is interested in psychometrics. It can serve as a long, good, but a bit dated introduction to the subject. For a shorter introduction, Gottfredson’s “Why g Matters” is probably better.
Quotes and comments below. Red text = quotes.
Galton had no tests for obtaining direct measurements of cognitive ability.
Yet he tried to estimate the mean levels of mental capacity possessed by different
racial and national groups on his interval scale of the normal curve. His
estimates—many would say guesses—were based on his observations of people of
different races encountered on his extensive travels in Europe and Africa, on
anecdotal reports of other travelers, on the number and quality of the inventions
and intellectual accomplishments of different racial groups, and on the
percentage of eminent men in each group, culled from biographical sources. He
ventured that the level of ability among the ancient Athenian Greeks averaged “two
grades” higher than that of the average Englishmen of his own day. (Two grades
on Galton’s scale is equivalent to 20.9 IQ points.) Obviously, there is no
possibility of ever determining if Galton’s estimate was anywhere near correct. He
also estimated that African Negroes averaged “at least two grades” (i.e., 1.39σ,
or 20.9 IQ points) below the English average. This estimate appears remarkably
close to the results for phenotypic ability assessed by culture-reduced IQ tests.
Studies in sub-Saharan Africa indicate an average difference (on culture-reduced
nonverbal tests of reasoning) equivalent to 1.43σ, or 21.5 IQ points between
blacks and whites.8 U.S. data from the Armed Forces Qualification Test (AFQT),
obtained in 1980 on large representative samples of black and white youths,
show an average difference of 1.36σ (equivalent to 20.4 IQ points)—not far
from Galton’s estimate (1.39σ, or 20.9 IQ points).9 But intuition and informed
guesses, though valuable in generating hypotheses, are never acceptable as
evidence in scientific research. Present-day scientists, therefore, properly dismiss
Galton’s opinions on race. Except as hypotheses, their interest is now purely
biographical and historical.
yes there is. first, one can check the historical record to look for dysgenic effects. if the british are less smart than the ancient greeks, there wud probably hav been som dysgenic effects somwher in history. still, this is not a good method, since the population groups are somwhat different.
second, soon we will know the genes that cause different levels of intelligence. we can then analyze the remains of ancient greeks to see which genes they had. this shud giv a pretty good estimate, altho not a perfect one, becus of 1) new mutations that hav com by since then, 2) som gene variants that hav perhaps disappeared, 3) the difficulty of getting a representativ sample of ancient greeks to test, 4) the problems with getting good enuf quality DNA to run tests on. still, i dont think these are impossible to overcom, and i predict that som decent estimate can be made.
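The IQ-point figures in the quote above are just the gaps in standard deviation units multiplied by the SD of the IQ scale. A minimal sketch of that arithmetic, assuming the conventional SD of 15 (the quoted numbers are consistent with it, but the passage does not state it); the function name is made up for illustration:

```python
IQ_SD = 15.0  # assumption: conventional IQ standard deviation


def sigma_to_iq_points(gap_in_sd, sd=IQ_SD):
    """Convert a gap expressed in standard deviations to IQ points."""
    return gap_in_sd * sd


if __name__ == "__main__":
    # Gaps in SD units as quoted in the passage.
    quoted_gaps = [
        ("Galton, two grades", 1.39),
        ("sub-Saharan studies", 1.43),
        ("AFQT 1980", 1.36),
    ]
    for label, gap in quoted_gaps:
        print(f"{label}: {gap:.2f} SD = {sigma_to_iq_points(gap):.2f} IQ points")
```

Rounded to one decimal, these products match the 20.9, 21.5 and 20.4 points given in the quote.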
A General Factor Is Not Inevitable. Factor analysis is not by its nature
bound to produce a general factor regardless of the nature of the correlation
matrix that is analyzed. A general factor emerges from a hierarchical factor
analysis if, and only if, a general factor is truly latent in the particular correlation
matrix. A general factor derived from a hierarchical analysis should be based
on a matrix of positive correlations that has at least three latent roots
(eigenvalues) greater than 1.
For proof that a general factor is not inevitable, one need only turn to studies
of personality. The myriad of inventories that measure various personality traits
have been subjected to every type of factor analysis, yet no general factor has
ever emerged in the personality domain. There are, however, a great many
first-order group factors and several clearly identified second-order group factors, or
“superfactors” (e.g., introversion-extraversion, neuroticism, and psychoticism),
but no general factor. In the abilities domain, on the other hand, a general factor,
g, always emerges, provided the number and variety of mental tests are sufficient
to allow a proper factor analysis. The domain of body measurements (including
every externally measurable feature of anatomy) when factor analyzed also
shows a large general factor (besides several small group factors). Similarly, the
correlations among various measures of athletic ability show a substantial gen
Jensen was wrong about this, altho the significance of that is disputed afaict. see:
How important is the General Factor of Personality? A General Critique (William Revelle and Joshua Wilt), PDF
In jobs where assurance of competence is absolutely critical, however, such
as airline pilots and nuclear reactor operators, government agencies seem to have
recognized that specific skills, no matter how well trained, though essential for
job performance, are risky if they are not accompanied by a fairly high level of
g. For example, the TVA, a leader in the selection and training of reactor
operators, concluded that results of tests of mechanical aptitude and specific job
knowledge were inadequate for predicting an operator’s actual performance on
the job. A TVA task force on the selection and training of reactor operators
stated: “intelligence will be stressed as one of the most important characteristics
of superior reactor operators. . . . intelligence distinguishes those who have
merely memorized a series of discrete manual operations from those who can
think through a problem and conceptualize solutions based on a fundamental
understanding of possible contingencies.”[6] This reminds one of Carl Bereiter’s
clever definition of “intelligence” as “what you use when you don’t know
what to do.”
funny and true
The causal underpinnings of mental development take place at the
neurological level even in the absence of any specific environmental inputs such as those
that could possibly explain mental growth in something like figure copying in
terms of transfer from prior learning. The well-known “Case of Isabel” is a
classic example.[8] From birth to age six, Isabel was totally confined to a dimly
lighted attic room, where she lived alone with her deaf-mute mother, who was
her only social contact. Except for food, shelter, and the presence of her mother,
Isabel was reared in what amounted to a totally deprived environment. There
were no toys, picture books, or gadgets of any kind for her to play with. When
found by the authorities, at age six, Isabel was tested and found to have a mental
age of one year and seven months and an IQ of about 30, which is barely at
the imbecile level. In many ways she behaved like a very young child; she had
no speech and made only croaking sounds. When handed toys or other
unfamiliar objects, she would immediately put them in her mouth, as infants
normally do. Yet as soon as she was exposed to educational experiences she
acquired speech, vocabulary, and syntax at an astonishing rate and gained six
years of tested mental age within just two years. By the age of eight, she had
come up to a mental age of eight, and her level of achievement in school was
on a par with her age-mates. This means that her rate of mental development—
gaining six years of mental age in only two years—was three times faster than
that of the average child. As she approached the age of eight, however, her
mental development and scholastic performance drastically slowed down and
proceeded thereafter at the rate of an average child. She graduated from high
school as an average student.
What all this means to the g controversy is that the neurological basis of
information processing continued developing autonomously throughout the six
years of Isabel’s environmental deprivation, so that as soon as she was exposed
to a normal environment she was able to learn those things for which she was
developmentally “ready” at an extraordinarily fast rate, far beyond the rate for
typically reared children over the period of six years during which their mental
age normally increases from two to eight years. But the fast rate of manifest
mental development slowed down to an average rate at the point where the level
of mental development caught up with the level of neurological development.
Clearly, the rate of mental development during childhood is not just the result
of accumulating various learned skills that transfer to the acquisition of new
skills, but is largely based on the maturation of neural structures.
this reminds me of the person who suggested that we delay teaching math in schools for the same reason. it is simply more time-effective, and time is costly, both for the child who has limited freedom in the time spent in school, and for society becus that time cud hav been spent on teaching somthing else, or not spent at all, thus saving money on teachers.
the idea is that som math subjects take very long to teach to, say, 8 year olds, but can be taught rapidly to 12 year olds. so, using som invented numbers, the idea is that instead of spending 10 hours teaching long division to 8 year olds, we cud spend 2 hours teaching long division to 12 year olds, thus saving 8 hours that can either be used on somthing else that can be taught easily to 8 year olds, or simply freed up for non-teaching activities.
see: www.inference.phy.cam.ac.uk/sanjoy/benezet/ for the original papers
Perhaps the most problematic test of overlapping neural elements posited by
the sampling theory would be to find two (or more) abilities, say, A and B, that
are highly correlated in the general population, and then find some individuals
in whom ability A is severely impaired without there being any impairment of
ability B. For example, looking back at Figure 5.2, which illustrates sampling
theory, we see a large area of overlap between the elements in Test A and the
elements in Test B. But if many of the elements in A are eliminated, some of
its elements that are shared with the correlated Test B will also be eliminated,
and so performance on Test B (and also on Test C in this diagram) will be
diminished accordingly. Yet it has been noted that there are cases of extreme
impairment in a particular ability due to brain damage, or sensory deprivation
due to blindness or deafness, or a failure in development of a certain ability due
to certain chromosomal anomalies, without any sign of a corresponding deficit
in other highly correlated abilities.22 On this point, behavioral geneticists
Willerman and Bailey comment: “Correlations between phenotypically different
mental tests may arise, not because of any causal connection among the mental
elements required for correct solutions or because of the physical sharing of
neural tissue, but because each test in part requires the same ‘qualities’ of brain
for successful performance. For example, the efficiency of neural conduction or
the extent of neuronal arborization may be correlated in different parts of the
brain because of a similar epigenetic matrix, not because of concurrent
functional overlap.”22 A simple analogy to this would be two independent electric
motors (analogous to specific brain functions) that perform different functions
both running off the same battery (analogous to g). As the battery runs down,
both motors slow down at the same rate in performing their functions, which
are thus perfectly correlated although the motors themselves have no parts in
common. But a malfunction of one machine would have no effect on the other
machine, although a sampling theory would have predicted impaired
performance for both machines.
i know its only an analogy, but whether ther ar one or two motors tapping from one battery might hav an effect on their speed. that depends on the setup, i think.
Gc is most highly loaded in tests based on scholastic knowledge and cultural
content where the relation-eduction demands of the items are fairly simple. Here
are two examples of verbal analogy problems, both of about equal difficulty in
terms of percentage of correct responses in the English-speaking general
population, but the first is more highly loaded on Gf and the second is more highly
loaded on Gc.
1. Temperature is to cold as Height is to
(a) hot (b) inches (c) size (d) tall (e) weight
2. Bizet is to Carmen as Verdi is to
(a) Aida (b) Elektra (c) Lakme (d) Manon (e) Tosca
first one, i wanted to answer <small>, since <cold> is on the bottom of the scale of temperature, so i wanted somthing that was on the bottom of the scale of height. ther is no such option, but tall is also on the scale of height, just as cold is on the scale of temperature. with no other better option, i went with (d), which was correct.
second one, however, made no sense to me. i did look for patterns in spelling, vowels, length, etc., found nothing. i then googled it. its composers and their operas.
Another blood variable of interest is the amount of uric acid in the blood
(serum urate level). Many studies have shown it to have only a slight positive
correlation with IQ. But it is considerably more correlated with measures of
ambition and achievement. Uric acid, which has a chemical structure similar to
caffeine, seems to act as a brain stimulant, and its stimulating effect over the
course of the individual’s life span results in more notable achievements than
are seen in persons of comparable IQ, social and cultural background, and
general life-style, but who have a lower serum urate level. High school students
with elevated serum urate levels, for example, obtain higher grades than their
IQ-matched peers with an average or below-average serum urate level, and,
amusingly, one study found a positive correlation between university professors’
serum urate levels and their publication rates. The undesirable aspect of high
serum urate level is that it predisposes to gout. In fact, that is how the association
was originally discovered. The English scientist Havelock Ellis, in studying the
lives and accomplishments of the most famous Britishers, discovered that they
had a much higher incidence of gout than occurs in the general population.
Asthma and other allergies have a much-higher-than-average frequency in
children with higher IQs (over 130), particularly those who are mathematically
gifted, and this is an intrinsic relationship. The intellectually gifted show some
15 to 20 percent more allergies than their siblings and parents. The gifted are
also more apt to be left-handed, as are the mentally retarded; the reason seems
to be that the IQ variance of left-handed persons is slightly greater than that of
the right-handed, hence more of the left-handed are found in the lower and upper
extremes of the normal distribution of IQ.
Then there are also a number of odd and less-well-established physical
correlates of IQ that have each shown up in only one or two studies, such as vital
capacity (i.e., the amount of air that can be expelled from the lungs), handgrip
strength, symmetrical facial features, light hair color, light eye color,
above-average basic metabolic rate (all these are positively correlated with IQ), and
being unable to taste the synthetic chemical phenylthiocarbamide (nontasters are
higher both in g and in spatial ability than tasters; the two types do not differ
in tests of clerical speed and accuracy). The correlations are small and it is not
yet known whether any of them are within-family correlations. Therefore, no
causal connection with g has been established.
Finally, there is substantial evidence of a positive relation between g and
general health or physical well-being. In a very large national sample of high
school students (about 10,000 of each sex) there was a correlation of +.381
between a forty-three-item health questionnaire and the composite score on a
large number of diverse mental tests, which is virtually a measure of g. By
comparison, the correlation between the health index and the students’
socioeconomic status (SES) was only +.222. Partialing out g leaves a very small
correlation (+.076) between SES and health status. In contrast, the correlation
between health and g when SES is partialed out is +.326.
how very curius!
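The "partialing out" in the quote is a first-order partial correlation. The sketch below applies the standard formula to the two quoted correlations. The quote does not give the g-SES correlation, so the value 0.415 here is an assumption, back-solved so that the formula roughly reproduces the quoted partials; it is not a figure from the book.

```python
from math import sqrt


def partial_corr(r_xy, r_xz, r_yz):
    """Correlation between x and y with z partialed out (first-order)."""
    return (r_xy - r_xz * r_yz) / sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))


R_HEALTH_G = 0.381    # quoted: health questionnaire vs. composite score (g)
R_HEALTH_SES = 0.222  # quoted: health index vs. SES
R_G_SES = 0.415       # ASSUMPTION: not in the quote, back-solved for illustration

# SES-health correlation with g held constant.
ses_health_given_g = partial_corr(R_HEALTH_SES, R_G_SES, R_HEALTH_G)
# Health-g correlation with SES held constant.
health_g_given_ses = partial_corr(R_HEALTH_G, R_HEALTH_SES, R_G_SES)

if __name__ == "__main__":
    print(f"SES-health | g : {ses_health_given_g:+.3f}")
    print(f"health-g | SES : {health_g_given_ses:+.3f}")
```

With that single assumed value, both quoted partials (+.076 and +.326) come out to within rounding, which at least shows the quoted numbers are internally consistent.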
Certainly psychometric tests were never constructed with the intention of
measuring inbreeding depression. Yet they most certainly do. At least fourteen
studies of the effects of inbreeding on mental ability test scores—mostly IQ—
have been reported in the literature.[32] Without exception, all of the studies show
inbreeding depression both of IQ and of IQ-correlated variables such as
scholastic achievement. As predicted by genetic theory, the IQ variance of the inbred
is greater than that of the noninbred samples. Moreover, the degree to which
IQ is depressed is an increasing monotonic function of the coefficient of
inbreeding. The severest effects are seen in the offspring of first-degree incestuous
matings (e.g., father-daughter, brother-sister); the effect is much less for
first-cousin matings and still less for second-cousin matings. The degree of IQ
depression for first cousins is about half a standard deviation (seven or eight IQ
In most of these studies, social class and other environmental factors are well
controlled. Studies in Muslim populations in the Middle East and India are
especially pertinent. Cousin marriages there are more prevalent in the higher
social classes, as a means of keeping wealth in family lines, so inbreeding and
high SES would tend to have opposite and canceling effects. The observed effect
of inbreeding depression on IQ in the studies conducted in these groups,
therefore, cannot be attributed to the environmental effects of SES that are often
claimed to explain IQ differences between socioeconomically advantaged and
These studies unquestionably show inbreeding depression for IQ and other
single measures of mental ability. The next question, then, concerns the extent
to which g itself is affected by inbreeding. Inbreeding depression could be
mainly manifested in factors other than g, possibly even in each test’s specificity.
To answer this question, we can apply the method of correlated vectors to
inbreeding data based on a suitable battery of diverse tests from which g can be
extracted in a hierarchical factor analysis. I performed these analyses[33] for the
several large samples of children born to first- and second-cousin matings in
Japan, for whom the effects of inbreeding were intensively studied by geneticists
William Schull and James Neel (1965). All of the inbred children and
comparable control groups of noninbred children were tested on the Japanese version
of the Wechsler Intelligence Scale for Children (WISC). The correlations among
the eleven subtests of the WISC were subjected to a hierarchical factor analysis,
separately for boys and girls, and for different age groups, and the overall
average g loadings were obtained as the most reliable estimates of g for each
subtest. The analysis revealed the typical factor structure of the WISC—a large
g factor and two significant group factors: Verbal and Spatial (Performance).
(The Memory factor could not emerge because the Digit Span subtest was not
used.) Schull and Neel had determined an index of inbreeding depression on
each of the subtests. In each subject sample, the column vector of the eleven
subtests’ g loadings was correlated with the column vector of the subtests’ index
of inbreeding depression (ID). (Subtest reliabilities were partialed out of these
correlations.) The resulting rank-order correlation between subtests’ g loadings
and their degree of inbreeding depression was +.79 (p < .025). The correlation
of ID with the Verbal factor loadings (independent of g) was +.50 and with the
Spatial (or Performance) factor the correlation was −.46. (The latter two
correlations are nonsignificant, each with p > .05.) Although this negative
correlation of ID with the spatial factor (independent of g) falls short of significance,
the negative correlation was found in all four independent samples. Moreover,
it is consistent with the hypothesis that spatial visualization ability is affected
by an X-linked recessive allele.34 Therefore, it is probably not a fluke.
A more recent study[35] of inbreeding depression, performed in India, was
based entirely on the male offspring of first-cousin parents and a control group
of the male offspring of genetically unrelated parents. Because no children of
second-cousin marriages were included, the degree of inbreeding depression was
considerably greater than in the previous study, which included offspring of
second-cousin marriages. The average inbreeding effect on the WISC-R Full
Scale IQ was about ten points, or about two-thirds of a standard deviation.[36]
The inbreeding index was reported for the ten subtests of the WISC-R used in
this study. To apply the method of correlated vectors, however, the correlations
among the subtests for this sample are needed to calculate their g loadings.
Because these correlations were not reported, I have used the g loadings obtained
from a hierarchical factor analysis of the 1,868 white subjects in the WISC-R
standardization sample.[37] The column vector of these g loadings and the column
vector of the ID index have a rank-order correlation (with the tests’ reliability
coefficients partialed out) of +.83 (p < .01), which is only slightly larger than
the corresponding correlation between the g and ID vectors in the Japanese
In sum, then, the g factor significantly predicts the degree to which
performance on various mental tests is affected by inbreeding depression, a theoretically
predictable effect for traits that manifest genetic dominance. The larger a test’s
g loading, the greater is the depression of the test scores of the inbred offspring
of consanguineous parents, as compared with the scores of noninbred persons.
The evidence in these studies of inbreeding rules out environmental variables
as contributing to the observed depression of test scores. Environmental
differences were controlled statistically, or by matching the inbred and noninbred
groups on relevant indices of environmental advantage.
pretty large effects. the footnote with the 14 studies mentioned is:
Adams & Neel, 1967; Afzal, 1988; Afzal & Sinha, 1984; Agrawal et al., 1984;
Badaruddoza & Afzal, 1993; Bashi, 1977; Book, 1957; Carter, 1967; Cohen et al., 1963;
Inbaraj & Rao, 1978; Neel et al., 1970; Schull & Neel, 1965; Seemanova, 1971; Slatis
& Hoene, 1961.
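The core step of the method of correlated vectors described in the quote is a rank-order (Spearman) correlation between two columns: each subtest's g loading and its inbreeding depression index. A minimal sketch of that step follows. The five pairs of numbers are invented purely for illustration and are not the Japanese or Indian data, and the partialing out of subtest reliabilities that Jensen mentions is left out.

```python
def average_ranks(values):
    """Rank values from 1 upward, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Rank-order correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# HYPOTHETICAL values, one entry per subtest, for illustration only.
g_loadings = [0.70, 0.55, 0.62, 0.48, 0.66]
id_index = [9.1, 4.0, 6.5, 5.2, 7.0]

if __name__ == "__main__":
    print(f"rank-order correlation: {spearman(g_loadings, id_index):+.2f}")
```

A real analysis wud use the eleven (or ten) subtests with their actual loadings and ID indices, and wud also control for subtest reliability.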
Semantic Verification Test. The SVT uses the binary response console
(Figure 8.3) and a computer display screen. Following the preparatory “beep,” a
simple statement appears on the screen. The statement involves the relative
positions of the three letters A, B, C as they may appear (equally spaced) in a
horizontal array. Each trial uses one of the six possible permutations of these
three letters chosen at random. The statement appears on the screen for three
seconds, allowing more than enough time for the subject to read it. There are
fourteen possible statements of the following types: “A after B,” “C before
A,” “A between B and C,” “B first,” “B last,” “C before A and B,” “C
after B and A”; and the negative form of each of these statements, for instance,
“A not after B.” Following the three-second appearance of one of these
statements, the screen goes blank for one second and then one of the permutations
of the letters A B C appears. The subject responds by pressing either the TRUE
or FALSE button, depending on whether the positions of the letters does or does
not agree with the immediately previous statement.
Although the SVT is the most complex of the many ECTs that have been
tried in my lab, the average RT for university students is still less than 1 second.
The various “problems” differ widely in difficulty, with average RTs ranging
from 650 msec to 1,400 msec. Negative statements take about 200 msec longer
than the corresponding positive statements. MT, on the other hand, is virtually
constant across conditions, indicating that it represents something other than
speed of information processing.
The overall median RT and RTSD as measured in the SVT each correlates
about —.50 with scores on the Raven’s Advanced Progressive Matrices given
without time limit. The average RT on the SVT also shows large differences
between Navy recruits and university students,[20] and between academically
gifted children and their less gifted siblings.[21] The fact that there is a
within-families correlation between RT and IQ indicates that these variables are
intrinsically and functionally related.
One study20 reveals that the average processing time for each of the fourteen
types of SVT statements in university students predicts the difficulty level of
the statements (in terms of error responses) in children (third-graders) who were
given the SVT as a nonspeeded paper-and-pencil test. While the SVT is of such
trivial difficulty for college students that individual differences are much more
reliably reflected by RT rather than by errors, the SVT items are relatively
difficult for young children. Even when they take the SVT as a nonspeeded
paper-and-pencil test, young children make errors on about 20 percent of the
trials. (The few university students who made even a single error under these
conditions, given as a pretest, were screened out.) The fact that the rank order
of the children’s error rates on the various types of SVT statements closely
corresponds to the rank order of the college students’ average RTs on the same
statements indicates that item difficulty is related to speed of processing, even
when the test is nonspeeded.
It appears that if information exceeds a critical level of complexity for the
individual, the individual’s speed of processing is too slow to handle the
information all at once; the system becomes overloaded and processing breaks
down, with resulting errors, even for nonspeeded tests on which subjects are
told to take all the time they need. There are some items in Raven’s Advanced
Matrices, for example, that the majority of college students cannot solve with
greater than chance success, even when given any amount of time, although the
problems do not call for the retrieval of any particular knowledge. As already
noted, the scores on such nonspeeded tests are correlated with the speed of
information processing in simple ECTs that are easily performed by all subjects
in the study.
interesting test. the threshold hypothesis is also interesting for makers of IQ tests.
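The logic of a single SVT trial, as described in the quote, can be sketched in a few lines: a statement about the letters A, B, C is checked against one of the six possible arrangements. This is only an illustration of the truth-checking step, not Jensen's lab software. The function and constant names are made up, and it assumes that "after" and "before" mean anywhere to the right or left, not necessarily adjacent, which the quote does not specify.

```python
from itertools import permutations

# The seven positive statement forms named in the passage; each also has a
# negated form, which gives the fourteen statement types.
KINDS = ("after", "before", "between", "first", "last", "before_both", "after_both")

# The six possible left-to-right arrangements of the three letters.
ARRANGEMENTS = ["".join(p) for p in permutations("ABC")]


def svt_truth(kind, letters, arrangement, negated=False):
    """Return True if the statement agrees with the displayed arrangement.

    kind        -- one of KINDS
    letters     -- letters in the order they occur in the statement,
                   e.g. "AB" for "A after B", "ABC" for "A between B and C"
    arrangement -- displayed order, e.g. "BAC"
    negated     -- True for the "not" form, e.g. "A not after B"
    """
    pos = {letter: i for i, letter in enumerate(arrangement)}
    subject = pos[letters[0]]
    others = [pos[ch] for ch in letters[1:]]
    if kind == "after":
        holds = subject > others[0]
    elif kind == "before":
        holds = subject < others[0]
    elif kind == "between":
        holds = min(others) < subject < max(others)
    elif kind == "first":
        holds = subject == 0
    elif kind == "last":
        holds = subject == len(arrangement) - 1
    elif kind == "before_both":
        holds = subject < min(others)
    elif kind == "after_both":
        holds = subject > max(others)
    else:
        raise ValueError(f"unknown statement kind: {kind}")
    # A negated statement is true exactly when the positive form is false.
    return holds != negated


if __name__ == "__main__":
    # "A after B" checked against every arrangement.
    for arrangement in ARRANGEMENTS:
        print(arrangement, svt_truth("after", "AB", arrangement))
```

The correct button press on a trial is then TRUE or FALSE according to the returned value; the interesting measurement in the actual test is of course the reaction time, which this sketch does not model.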
There are many other kinds of simple tasks that do not resemble the
contents of conventional psychometric tests but that have significant correlations
with IQ. Many studies have confirmed Spearman’s finding that pitch
discrimination is g-loaded, and other musical discriminations, in duration, timbre,
rhythmic pattern, pitch interval, and harmony, are correlated with IQ,
independently of musical training.28 The strength of certain optical illusions is also
significantly related to IQ.[29] Surprisingly, higher-IQ subjects experience
certain illusions more strongly than subjects with lower IQ, probably because
seeing the illusion implies a greater amount of mental transformation of the
stimulus, and tasks that involve transformation of information (e.g., backward
digit span) are typically more g loaded than tasks involving less
transformation of the input (e.g., forward digit span). The positive correlation between
IQ and susceptibility to illusions is consistent with the fact that susceptibility
to optical illusions also increases with age, from childhood to maturity, and
then decreases in old age—the same trajectory we see for raw-score
performance on IQ tests and for speed and intraindividual consistency of RT in ECTs.
The speed and consistency of information processing generally show an
inverted U curve across the life span.
Jensen mentions the en.wikipedia.org/wiki/Yerkes-Dodson_law
interesting. i link to Wikipedia since i think its explanation of the law is better than Jensens, who just briefly mentions it.
[...Localized damage to the brain
areas that normally subserve one of these group factors can leave the person
severely impaired in the expression of the abilities loaded on the group factor,
but with little or no impairment of abilities that are loaded on other group factors
or on g.]
A classic example of this is females who are born with a chromosomal
anomaly known as Turner’s syndrome.[70] Instead of having the two normal female
sex chromosomes (designated XX), they lack one X chromosome (hence are
designated XO). Provided no spatial visualization tests are included in the IQ
battery, the IQs of these women (and presumably their levels of g) are normally
distributed and virtually indistinguishable from that of the general population.
Yet their performance on all tests that are highly loaded on the
spatial-visualization factor is extremely low, typically borderline retarded, even in
Turner’s syndrome women with verbal IQs above 130. It is as if their level of
g is almost totally unreflected in their level of performance on spatial tasks.
It is much harder to imagine the behavior of persons who are especially
deficient in all abilities involving g and all of the major group factors, but have
only one group factor that remains intact. In our everyday experience, persons
who are highly verbal, fluent, articulate, and use a highly varied vocabulary,
speaking with perfect syntax and appropriate expression, are judged to be of at
least average or probably superior IQ. But there is a rare and, until recently,
little-known genetic anomaly, Williams syndrome,[71] in which the above-listed
characteristics of high verbal ability are present in persons who are otherwise
severely mentally deficient, with IQs averaging about 50. In most ways,
Williams syndrome persons appear to behave with no more general capability of
getting along in the world than most other persons with similarly low IQs. As
adults, they display only the most rudimentary scholastic skills and must live
under supervision. Only their spoken verbal ability has been spared by this
genetic defect. But their verbal ability appears to be “hollow” with respect to
g. They speak in complete, often complex, sentences, with good syntax, and
even use unusual words appropriately. (They do surprisingly well on the
Peabody Picture Vocabulary Test.) In response to a series of pictures, they can tell
a connected and fully elaborated story, accompanied by appropriate, if somewhat
exaggerated, emotional expression. Yet they have exceedingly little ability to
reason, or to explain or summarize the meaning of what they say. On most
spatial ability tests they generally perform on a par with Down syndrome persons
of comparable IQ, but they also differ markedly from Down persons in peculiar
ways. Williams syndrome subjects are more handicapped than IQ-matched
Down subjects in figure copying and block designs.
Comparing Turner’s syndrome with Williams syndrome obviously suggests
the generalization that a severe deficiency of one group factor in the presence
of an average level of g is far less a handicap than an intact group factor in the
presence of a very low level of g.
never heard of Williams syndrome befor.
The correlation of IQ with grades and achievement test scores is highest (.60
to .70) in elementary school, which includes virtually the entire child population
and hence the full range of mental ability. At each more advanced educational
level, more and more pupils from the lower end of the IQ distribution drop out,
thereby restricting the range of IQs. The average validity coefficients decrease
accordingly: high school (.50 to .60), college (.40 to .50), graduate school (.30
to .40). All of these are quite high, as validity coefficients go, but they permit
far less than accurate prediction of a specific individual. (The standard error of
estimate is quite large for validity coefficients in this range.)
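To put a number on the parenthetical about the standard error of estimate, here is a minimal sketch using the textbook formula SEE = SD × sqrt(1 − r²). The r values are the midpoints of the quoted ranges, and expressing the error in units of the criterion's standard deviation is my choice; neither is a figure from Jensen.

```python
import math

def standard_error_of_estimate(r, sd_criterion=1.0):
    """Standard error of estimate when predicting a criterion from one predictor."""
    return sd_criterion * math.sqrt(1.0 - r ** 2)

# Midpoints of the validity ranges quoted above (taking midpoints is an assumption).
validity_by_level = {
    "elementary school": 0.65,
    "high school": 0.55,
    "college": 0.45,
    "graduate school": 0.35,
}

for level, r in validity_by_level.items():
    see = standard_error_of_estimate(r)
    print(f"{level}: r = {r:.2f}, SEE = {see:.2f} criterion SDs")
```

Even at r = .65 the error is still roughly three quarters of a criterion SD, which is why prediction for a specific individual stays loose.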
interesting. one thing that i hav been thinking about is that my GPA thruout my life has always been a bit abov average, but not close to the top. given that the intelligence requirement for each new step on the way thru the school system increases, one wud hav expected a drop in GPA, but no such thing happened. in fact, its the other way around. my GPA in the danish elementary school is 9.3 (9th grade), the average is ~8.1. this includes grades from non-intellectual subjects such as the ‘subject’ of having a nice hand-writing (yes seriusly). in 10th grade my average was 8.7, and the average is ~6.6. the max is 13 in all cases, altho normally grades abov 11 wer not given.
in gymnasiet (high school equiv.ish), my GPA was 7.8 and the average is 7.0. the slightly lower grades is becus the system was changed from a 13-step to a 7-step scale. and for comparison reasons, one can note that i went to HTX which has lower grades. the percentile level is 65th.
my university grades befor dropping out of filosofy were rather good, lots of 10’s, but i dont know the average, so cant compare. i suspect they were abov average again.
Unless an individual has made the transition from word reading to reading
comprehension of sentences and paragraphs, reading is neither pleasurable nor
practically useful. Few adults with an IQ of eighty (the tenth percentile of the
overall population norm) ever make the transition from word reading skill to
reading comprehension. The problem of adult illiteracy (defined as less than a
fourth-grade level of reading comprehension) in a society that provides an elementary
school education to virtually its entire population is therefore largely a
problem of the lower segment of the population distribution of g. In the vast
majority of people with low reading comprehension, the problem is not word
reading per se, but lack of comprehension. These individuals score about the
same on tests of reading comprehension even if the test paragraphs are read
aloud to them by the examiner. In other words, individual differences in oral
comprehension and in reading comprehension are highly correlated.[21]
80.. but the american black average is only about 85. is it really true that ~37% of them ar too dull to learn to read properly? compared with ~10% of whites.
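a quick check of those percentages, as a rough sketch. It assumes IQ is normally distributed in each group with SD 15 and means of 85 and 100; the SD and the normality are my assumptions, and the real distributions may differ.

```python
import math

def share_below(cutoff, mean, sd=15.0):
    """Fraction of a normal(mean, sd) population scoring below cutoff."""
    z = (cutoff - mean) / sd
    # Normal CDF written with the error function from the standard library.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

print(f"mean 85:  {share_below(80, 85):.1%} below IQ 80")
print(f"mean 100: {share_below(80, 100):.1%} below IQ 80")
```

Under those assumptions the shares come out at about 37% and about 9%, so the figures in the comment are in the right range.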
Virtually every type of work calls for behavior that is guided by cognitive
processes. As all such processes reflect g to some extent, work proficiency is g
loaded. The degree depends on the level of novelty and cognitive complexity
the job demands. No job is so simple as to be totally without a cognitive component.
Several decades of empirical studies have shown thousands of correlations
of various mental tests with work proficiency. One of the most important
conclusions that can be drawn from all this research is that mental ability tests
in general have a higher success rate in predicting job performance than any
other variables that have been researched in this context, including (in descending
order of average predictive validity) skill testing, reference checks, class
rank or grade-point average, experience, interview, education, and interest measures.[22]
In recent years, one personality constellation, characterized as “conscientiousness,”
has emerged near the top of the list (just after general mental
ability) as a predictor of occupational success.
reminds me that i ought to look into this field of psychology. its called I/O psychology. som time back i talked with a phd (i think) on 4chan who studied that area. he said that if he had his way, he wud just rely on g alone to predict job performance, training etc. he recommended me a textbook, which i found on the internet.
Psychology Applied to Work, An Introduction to Industrial and Organizational Psychology – Paul M. Muchinsky
it seems decent.
A person cannot perform a job successfully without the specific knowledge
required by the job. Possibly such job knowledge could be acquired on the job
after a long period of trial-and-error learning. For all but the very simplest jobs,
however, trial-and-error learning is simply too costly, both in time and in errors.
Job training inculcates the basic knowledge much more efficiently, provided that
later on-the-job experience further enhances the knowledge or skills acquired in
prior job training. Because knowledge and skill acquisition depend on learning,
and because the rate of learning is related to g, it is a reasonable hypothesis that
g should be an effective predictor of individuals’ relative success in any specific [...]
The best studies for testing this hypothesis have been performed in the armed
forces. Many thousands of recruits have been selected for entering different
training programs for dozens of highly specialized jobs based on their performance
on a variety of mental tests. As the amount of time for training is limited,
efficiency dictates assigning military personnel to the various training schools
so as to maximize the number who can complete the training successfully and
minimize the number who fail in any given specialized school. When a failed
trainee must be rerouted to a different training school better suited to his aptitude,
it wastes time and money. Because the various schools make quite differing
demands on cognitive abilities, the armed services employ psychometric researchers
to develop and validate tests to best predict an individual’s probability
of success in one or another of the various specialized schools.
one is tempted to say ”common sense”, but apparently, only the military dares to do such things.
A rough analogy may help to make the essential point. Suppose that for some
reason it was impossible to measure persons’ heights directly in the usual way,
with a measuring stick. However, we still could accurately measure the length
of the shadow cast by each person when the person is standing outdoors in the
sunlight. Provided everyone’s shadow is measured at the same time of day, at
the same day of the year, and at the same latitude on the earth’s surface, the
shadow measurements would show exactly the same correlations with persons’
weight, shoe size, suit or dress size, as if we had measured everyone directly
with a yardstick; and the shadow measurements could be used to predict perfectly
whether or not a given person had to stoop when walking through a door
that is only 5½-feet high. However, if one group of persons’ shadows were
measured at 9:00 a.m. and another group’s at 10:00 a.m., the pooled measurements
would show a much smaller correlation with weight and other factors
than if they were all measured at the same time, date, and place, and the measurements
would have poor validity for predicting which persons could walk
through a 5½-foot door without stooping. We would say, correctly, that these
measurements are biased. In order to make them usefully accurate as predictors
of a person’s weight and so forth, we would have to know the time the person’s
shadow was measured and could then add or subtract a value that would adjust
the measurement so as to make it commensurate with measurements obtained
at some other specific time, date, and location. This procedure would permit the
standardized shadow measurements of height, which in principle would be as
good as the measurements obtained directly with a measuring stick.
Standardized IQs are somewhat analogous to the standardized shadow measurements
of height, while the raw scores on IQ tests are more analogous to the
raw measurements of the shadows themselves. If we naively remain unaware
that the shadow measurements vary with the time of day, the day of the year,
and the degrees of latitude, our raw measurements would prove practically
worthless for comparing individuals or groups tested at different times, dates,
or places. Correlations and predictions could be accurate only within each unique
group of persons whose shadows were measured at the same time, date, and
place. Since psychologists do not yet have the equivalent of a yardstick for
measuring mental ability directly, their vehicles of mental measurement (IQ
scores) are necessarily “shadow” measurements, as in our height analogy, albeit
with amply demonstrated practical predictive validity and construct validity
within certain temporal and cultural limits.
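The shadow analogy can be played out numerically. This is only a toy sketch: the heights, the two sun elevations standing in for 9:00 and 10:00 a.m., and the use of true height (rather than weight or shoe size) as the thing being predicted are all made-up assumptions. The adjustment here is a multiplicative rescaling, which is what the geometry calls for.

```python
import math

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

def shadow_length(height_cm, sun_elevation_deg):
    """Shadow cast on flat ground by a vertical object."""
    return height_cm / math.tan(math.radians(sun_elevation_deg))

heights = [150, 158, 165, 171, 176, 182, 190]   # hypothetical people, in cm
ELEV_9AM, ELEV_10AM = 30.0, 40.0                # assumed sun elevations

group_a = [shadow_length(h, ELEV_9AM) for h in heights]
group_b = [shadow_length(h, ELEV_10AM) for h in heights]

# Within one group the raw shadows order people perfectly.
r_within = pearson(group_a, heights)

# Pooling raw shadows taken at different times wrecks the correlation.
r_pooled = pearson(group_a + group_b, heights + heights)

# "Standardizing": rescale the 10 a.m. shadows to the 9 a.m. sun angle.
scale = math.tan(math.radians(ELEV_10AM)) / math.tan(math.radians(ELEV_9AM))
group_b_adj = [s * scale for s in group_b]
r_adjusted = pearson(group_a + group_b_adj, heights + heights)

print(r_within, r_pooled, r_adjusted)
```

The raw pooled correlation drops well below 1, while the rescaled shadows recover it, which is the point of the raw score versus standardized score distinction.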
interesting. however, biologically based tests shud allow for absolut measurement, say tests based on RT in ECTs, or tests based on the amount of myelination in the brain, or brain pH levels, brain size via brain imaging scans if we can make them better measurements of g, etc.
Many possible factors determine whether a person passes or fails a particular
test item. Does the person understand the item at all (e.g., “What is the sum of
all the latent roots of a 7 × 7 R matrix?”)? Has the person acquired the specific
knowledge called for by the item (e.g., “Who wrote Faust?”), or perhaps has
he acquired it in the past and has since forgotten it? Did the person really know
the answer, but just couldn’t recall it at the moment of being tested? Does the
item call for a cognitive skill the person either never acquired or has forgotten
through disuse (e.g., “How much of a whole apple is two-thirds of one-half of
the apple?”)? Does the person understand the problem and know how to solve
it, but is unable to do it within the allotted time limit (e.g., substituting the
corresponding letter of the alphabet for each of the numbers from one to twenty-
six listed in a random order in one minute)? Or even when there is a liberal
time limit does the person give up on the item or just guess at the answer
prematurely, perhaps because the item looks too complicated at first glance (e.g.,
“If it takes six garden hoses, all running for three hours and thirty minutes to
fill a tank, how many additional hoses would be needed to fill the tank in thirty [...]
4) #hose*time=tank size
21 is the size of the tank (6 hoses * 3.5 hours)
21=0.5*#hose, solve #hose: #hose=42
42-6=36 more hoses
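the same working as a small sketch. It assumes every hose has the same flow rate and that the target time is half an hour, which is the 0.5 used in the working; the function name is just for illustration.

```python
from fractions import Fraction

def additional_hoses(hoses, hours, target_hours):
    """Extra hoses needed to fill the same tank in target_hours (equal flow per hose)."""
    tank_size = hoses * hours            # tank size measured in hose-hours
    needed = tank_size / target_hours    # total hoses required for the shorter time
    return needed - hoses

# 6 hoses for 3.5 hours, target of 0.5 hours; Fractions keep the arithmetic exact.
extra = additional_hoses(Fraction(6), Fraction(7, 2), Fraction(1, 2))
print(extra)  # 36
```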
The only study I have found that investigated whether there has been a secular
change (over thirty years) in the heritability of g-loaded test scores concluded
that “the results revealed no unambiguous evidence for secular trends in the
heritability of intelligence test scores.”[35] However, the heritability coefficients
(based on twenty-two same-age cohort samples of MZ and DZ male twins born
in Norway between 1930 and 1960) showed some statistically reliable nonlinear
trends over the thirty-year period, as shown in Figure 10.2. The overall trend
line goes equally down-up-down-up with heritability coefficients ranging from
slightly above .80 to slightly below .40. The heritability coefficient was the same
for the cohort born in 1930 as for the cohort born in 1960 (for both, h² = .80).
The authors offer only weak ad hoc speculations about possible causes of this
erratic fluctuation of h² across 22 points in time.
the hole is the german occupation of norway. the data from the 30s make sense to me, the depression wud result in civil unrest and the changing up of society. after a period of such, heritabilities shud stabilize again, as seen in the after war period. i dont understand the 50s down swing in heritability.
so, i thought it might be somthing economic. i gathered GDP data, and looked at the data. nope, not true.
data from 1901 to 2000 looks like this:
[figure: gdp norway 50s]
doesnt fit with the GDP hypothesis at all, except for missing data in the war.
i dunno, perhaps www.newsinenglish.no/2010/06/16/the-50s-in-norway-werent-so-nifty/
the authors of the study that found the drop in heritability also dont know ”We are, however, quite at a loss in explaining the dip from about 1950 to 1954. Thus, we feel that the best strategy at present is to leave the issue of secular trends open. ”
On the question of secular trends in the heritability of intelligence scores A study of Norwegian twins
Head Start. The federal preschool intervention known as Head Start, which
has been in continual existence now since 1964, is undoubtedly the largest-
scale, though not the most intensive, educational intervention program ever undertaken,
with an annual expenditure over $2 billion. The program is aimed at
improving the health status and the learning and social skills of preschoolers
from poor backgrounds so they can begin regular school more on a par with
children from more privileged backgrounds. The intervention is typically short
term, with various programs lasting anywhere from a few months to two years.
The general conclusion of the hundreds of studies based on Head Start data
is that the program has little, if any, effect on IQ or scholastic achievement that
endures beyond more than two to three years after exposure to Head Start. The
program does, however, have some potential health benefits, such as inoculations
of enrollees against common childhood diseases and improved nutrition (by
school-provided breakfast or lunch). The documented behavioral effects are less
retention-in-grade and lower dropout rates. The cause(s) of these effects are
uncertain. Because eligible children were not randomly enrolled in Head Start,
but were selected by parents and program administrators, these scholastic correlates
of Head Start are uninterpretable from a causal standpoint. Selection,
rather than direct causation by the educational intervention itself, could be the
explanation of Head Start’s beneficial outcomes.
crazy amount of money spent for som slight health benefits. perhaps ther is a cheaper way to get such benefits.
The Milwaukee Project. Aside from Head Start, this is the most highly
publicized of all intervention experiments. It was the most intensive and extensive
educational intervention ever conducted for which the final results have
been published.55 It was also the most costly single experiment in the history of
psychology and education—over $14 million. In terms of the highest peak of
IQ gains for the seventeen children in the treatment condition (before the gains
began to vanish), the cost was an estimated $23,000 per IQ point per child.
holy shit. even tho i think iv seen this figur befor (in The g Factor by Chris Brand).
Jensen also doesnt mention the end of the project, but Wikipedia does:
The Milwaukee Project’s claimed success was celebrated in the popular media and by famous psychologists. However, later in the project Rick Heber, the principal investigator, was discharged from the University of Wisconsin–Madison and convicted and imprisoned for large-scale abuse of federal funding for private gain. Two of Heber’s colleagues in the project were also convicted for similar abuses. The project’s results were not published in any refereed scientific journals, and Heber did not respond to requests from colleagues for raw data and technical details of the study. Consequently, even the existence of the project as described by Heber has been called into question. Nevertheless, many college textbooks in psychology and education have uncritically reported the project’s results.
this reminds me why open data is necessary in science.
[The Abecedarian Early Intervention Project.]
Both the T and C groups (each with about fifty subjects) were given age-
appropriate mental tests (Bayley, Stanford-Binet, McCarthy, WPPSI) at
six-month intervals from age six months to sixty months. The important comparisons
here are the mean T-C differences at each testing. (Because the test
scores do not have the same factor composition across this wide age range,
the absolute scores of the T group alone are not as informative of the efficacy
of the intervention as are the mean T-C differences.) At every testing from six
months to five years of age, the T group outperformed the C group, and the
overall average T-C difference (103.3 - 95.5 = 7.8 IQ points) was highly
significant (p < .001). Peculiarly, however, the largest T-C differences (averaging
fifteen IQ points) occurred between eighteen and thirty-six months of
age and then declined during the last two years of intervention. At sixty
months, the average T-C difference was 7.5 IQ points. This decrease might
simply reflect the fact that with the children’s increasing age the tests become
increasingly more g-loaded. The tests used before two or three years of age
measure mainly perceptual-motor functions that have relatively little g saturation.
Only later does g become the predominant component of variance in
IQ. In follow-up studies at eight and twelve years of age, the T-C difference
on the WISC-R was about five IQ points,[57] a difference that has remained up
to age fifteen. At the last reported testing, the T-C difference was 4.6 IQ
points, or a difference of 0.35σ. Scholastic achievement test scores showed a
somewhat larger effect of the intervention up to age fifteen.[57] The intervention
effect on other criteria of the project’s success was demonstrated by the
decreased percentage of children who repeated at least one grade by age
twelve (T = 28 percent, C = 55 percent) and the percentage of children with
borderline or retarded intelligence (IQ < 85) (T = 12.8 percent, C = 44.2 percent).
Thus this five-year program of intensive intervention beginning in early infancy
increased IQ (at age fifteen years) by about five points. Judging from a
comparable gain in scholastic achievement, the effect had broad transfer, suggesting
that it probably raised the level of g to some extent. The finding that
the T subjects did better than the C subjects on a battery of Piaget’s tests of
conservation, which reflect important stages in mental development, is further
evidence. The Piagetian tests are not only very different in task demands from
anything in the conventional IQ tests used in the conventional assessments, but
are also highly g loaded.[57] The mean T-C difference on the Piagetian conservation
tests was equal to 0.33σ (equivalent to five IQ points). Assuming that
the instructional materials in the intervention program did not closely resemble
Piaget’s tests, it is a warranted conclusion that the intervention appreciably
raised the level of g.
im still skeptical as to the g effects. id like to see the data about them as adults, and a larger sample size.
again, Wikipedia has mor on the issue, both positiv and negativ:
Follow-up assessment of the participants involved in the project has been ongoing. So far, outcomes have been measured at ages 3, 4, 5, 6.5, 8, 12, 15, 21, and 30. The areas covered were cognitive functioning, academic skills, educational attainment, employment, parenthood, and social adjustment. The significant findings of the experiment were as follows:
Impact of child care/preschool on reading and math achievement, and cognitive ability, at age 21:
- An increase of 1.8 grade levels in reading achievement
- An increase of 1.3 grade levels in math achievement
- A modest increase in Full-Scale IQ (4.4 points), and in Verbal IQ (4.2 points).
Impact of child care/preschool on life outcomes at age 21:
- Completion of a half-year more of education
- Much higher percentage enrolled in school at age 21 (42 percent vs. 20 percent)
- Much higher percentage attended, or still attending, a 4-year college (36 percent vs. 14 percent)
- Much higher percentage engaged in skilled jobs (47 percent vs. 27 percent)
- Much lower percentage of teen-aged parents (26 percent vs. 45 percent)
- Reduction of criminal activity
Statistically significant outcomes at age 30:
- Four times more likely to have graduated from a four-year college (23 percent vs. 6 percent)
- More likely to have been employed consistently over the previous two years (74 percent vs. 53 percent)
- Five times less likely to have used public assistance in the previous seven years (4 percent vs. 20 percent)
- Delayed becoming parents by average of almost two years
(Most recent information from Developmental Psychology, January 18, 2012, cited in uncnews.unc.edu, January 19, 2012)
The project concluded that high quality, educational child care from early infancy was therefore of utmost importance.
Other, less intensive programs, notably the Head Start Program, but also others, have not been as successful. It may be that they provided too little too late compared with the Abecedarian program.
Some researchers have advised caution about the reported positive results of the project. Among other things, they have pointed out analytical discrepancies in published reports, including unexplained changes in sample sizes between different assessments and publications. It has also been noted that the intervention group’s reported 4.6 point advantage in mean IQ at age 15 was not statistically significant. Herman Spitz has noted that a mean IQ difference of similar magnitude to the final difference between the intervention and control groups was apparent already at age six months, indicating that “4 1/2 years of massive intervention ended with virtually no effect.” Spitz has suggested that the IQ difference between the intervention and control groups may have been present from the outset due to faulty randomization.
not quite sure what to think. the sample sizes ar still kinda small, and if Spitz is right in his criticism, the studies hav not shown much.
the reason that im skeptical to begin with is that the modern twin studies show, that shared environment, which is what these studies change to a large degree, has no effect on adult IQ.
in any case, if it requires so expensiv spendings to get slightly less dumb kids, its hard to justify as a public policy. at the very least, id like to see the calculation that finds that this has a net positiv benefit for society. it is possible, for instance, becus crime rates ar (supposedly) down, and job retention up which leads to mor taxes being paid, and so on.
Error distractors in multiple-choice answers are of interest as a method of
discovering bias. When a person fails to select the correct answer but instead
chooses one of the alternative erroneous responses (called “distractors”) offered
for an item in a multiple-choice test, the person’s incorrect choice is not random,
but is about as reliable as is the choice of the correct answer. In other words,
error responses, like correct responses, are not just a matter of chance, but reflect
certain information processes (or the failure of certain crucial steps in information
processing) that lead the person to choose not just any distractor, but a
particular one. Some types of errors result from a solution strategy that is more
naive or less sophisticated than other types of errors. For example, consider the
following test item:
If you mix a pint of water at 50° temperature with two pints of water at 80°
measured on the same thermometer, what will be the temperature of the mixture?
(a) 65°, (b) 70°, (c) 90°, (d) 130°, (e) Can’t say without knowing
whether the temperatures are Centigrade or Fahrenheit.
We see that the four distractors differ in the level of sophistication in mental
processing that would lead to their choice. The most naive distractor, for example,
is D, which is arrived at by simple addition of 50° and 80°. The answer
A at least shows that the subject realized the necessity for averaging the temperatures.
The answer 90° is the most sophisticated distractor, as it reveals that
the subject had a glimmer of the necessity for a weighted average (i.e., 50° +
80°/2 = 90°) but didn’t know how to go about calculating it. (The correct
answer, of course, is B, because the weighted average is [1 pint × 50° + 2
pints × 80°]/3 pints = 70°.) Preference for selecting different distractors changes
across age groups, with younger children being attracted to the less sophisticated
type of distractor, as indicated by comparing the percentage of children in different
age groups that select each distractor. The kinds of errors made, therefore,
appear to reflect something about the children’s level of cognitive development.
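The item and its distractors can be reproduced in a few lines. This is a small sketch, assuming (as the item implicitly does) equal heat capacity per pint and no heat loss; the function and variable names are mine.

```python
def mix_temperature(volumes, temperatures):
    """Volume-weighted mean temperature of the mixture."""
    total_volume = sum(volumes)
    return sum(v * t for v, t in zip(volumes, temperatures)) / total_volume

pints = [1, 2]
degrees = [50, 80]

correct = mix_temperature(pints, degrees)        # option (b): weighted average
naive_sum = sum(degrees)                         # option (d): simple addition
unweighted_mean = sum(degrees) / len(degrees)    # option (a): plain average
half_weighted = degrees[0] + degrees[1] / 2      # option (c): botched weighting

print(correct, naive_sum, unweighted_mean, half_weighted)
```

Each distractor corresponds to a recognizable but incomplete strategy, which is why the choice among wrong answers carries information.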
What is termed a cline results where groups overlap at their fuzzy boundaries
in some characteristic, with intermediate gradations of the phenotypic characteristic,
often making the classification of many individuals ambiguous or even
impossible, unless they are classified by some arbitrary rule that ignores biology.
The fact that there are intermediate gradations or blends between racial groups,
however, does not contradict the genetic and statistical concept of race. The
different colors of a rainbow do not consist of discrete bands but are a perfect
continuum, yet we readily distinguish different regions of this continuum as
blue, green, yellow, and red, and we effectively classify many things according
to these colors. The validity of such distinctions and of the categories based on
them obviously need not require that they form perfectly discrete Platonic categories.
while the rainbow analogy works to som extent, it is not that good. the reason is that with rainbows, all the colors (groups) ar on a continuum in such a way that ther isnt a blend between every two colors (groups). this is not how races work, as ther is always the possibility of a blend between any two groups, even odd groups such as amerindians and aboriginals.
Of the approximately 100,000 human polymorphic genes, about 50,000 are
functional in the brain and about 30,000 are unique to brain functions. The
brain is by far the structurally and functionally most complex organ in the human
body and the greater part of this complexity resides in the neural structures of
the cerebral hemispheres, which, in humans, are much larger relative to total
brain size than in any other species. A general principle of neural organization
states that, within a given species, the size and complexity of a structure reflect
the behavioral importance of that structure. The reason, again, is that structure
and function have evolved conjointly as an integrated adaptive mechanism. But
as there are only some 50,000 genes involved in the brain’s development and
there are at least 200 billion neurons and trillions of synaptic connections in the
brain, it is clear that any single gene must influence some huge number of
neurons— not just any neurons selected at random, but complex systems of
neurons organized to serve special functions related to behavioral capacities.
It is extremely improbable that the evolution of racial differences since the
advent of Homo sapiens excluded allelic changes only in those 50,000 genes
that are involved with the brain.
the same point was made, altho less technically, in Hjernevask. ther is no good apriori reason to think that natural selection for som reason only worked on non-brain, non-behavioral genes. it simply makes no sense at all to suppose that.
Bear in mind that, from the standpoint of natural selection, a larger brain
size (and its corresponding larger head size) is in many ways decidedly disadvantageous.
A large brain is metabolically very expensive, requiring a high-
calorie diet. Though the human brain is less than 2 percent of total body weight,
it accounts for some 20 percent of the body’s basal metabolic rate (BMR). In
other primates, the brain accounts for about 10 percent of the BMR, and for
most carnivores, less than 5 percent. A larger head also greatly increases the
difficulty of giving birth and incurs much greater risk of perinatal trauma or
even fetal death, which are much more frequent in humans than in any other
animal species. A larger head also puts a greater strain on the skeletal and
muscular support. Further, it increases the chances of being fatally hit by an
enemy’s club or missile. Despite such disadvantages of larger head size, the
human brain, in fact, evolved markedly in size, with its cortical layer accommodating
to a relatively lesser increase in head size by becoming highly convoluted
in the endocranial vault. In the evolution of the brain, the effects of
natural selection had to have reflected the net selective pressures that made an
increase in brain size disadvantageous versus those that were advantageous. The
advantages obviously outweighed the disadvantages to some degree or the increase
in hominid brain size would not have occurred.
this brain must hav been very useful for somthing. if som of this use has to do with non-social things, like environment, one wud expect to see different levels of ‘brain adaptation’ due to the relative differences in selection pressure in populations that evolved in different environments.
How then can the default hypothesis be tested empirically? It is tested exactly
as is any other scientific hypothesis; no hypothesis is regarded as scientific unless
predictions derived from it are capable of risking refutation by an empirical test.
Certain predictions can be made from the default hypothesis that are capable of
empirical test. If the observed result differs significantly from the prediction, the
hypothesis is considered disproved, unless it can be shown that the tested prediction
was an incorrect deduction from the hypothesis, or that there are artifacts
in the data or methodological flaws in their analysis that could account for the
observed result. If the observed result does in fact accord with the prediction,
the hypothesis survives, although it cannot be said to be proven. This is because
it is logically impossible to prove the null hypothesis, which states that there is
no difference between the predicted and the observed result. If there is an alternative
hypothesis, it can also be tested against the same observed result.
For example, if we hypothesize that no tiger is living in the Sherwood Forest
and a hundred people searching the forest fail to find a tiger, we have not proved
the null hypothesis, because the searchers might have failed to look in the right
places. If someone actually found a tiger in the forest, however, the hypothesis
is absolutely disproved. The alternative hypothesis is that a tiger does live in
the forest; finding a tiger clearly proves the hypothesis. The failure of searchers
to find the tiger decreases the probability of its existence, and the more searching,
the lower is the probability, but it can never prove the tiger’s nonexistence.
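The tiger example can be read as a simple Bayesian update. The sketch below is one way to formalize it, not Jensen's own: the prior of 0.5, the 5 percent chance that a single search spots an existing tiger, the independence of searches, and the absence of false sightings are all assumptions picked for illustration.

```python
def p_tiger_after_failed_searches(prior, detect_prob, n_searches):
    """Posterior P(tiger exists) after n independent searches that all find nothing."""
    # P(no sighting in any search | tiger exists)
    miss_all = (1.0 - detect_prob) ** n_searches
    # Bayes' rule; if no tiger exists, finding nothing is certain.
    return prior * miss_all / (prior * miss_all + (1.0 - prior))

PRIOR = 0.5     # assumed prior probability that a tiger lives in the forest
DETECT = 0.05   # assumed chance one search finds the tiger if it is there

for n in (0, 10, 100, 1000):
    print(n, p_tiger_after_failed_searches(PRIOR, DETECT, n))
```

The posterior shrinks with every failed search but stays above zero for any finite number of searches, which matches the claim that nonexistence can be made improbable but never proved this way.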
Similarly, the default hypothesis predicts certain outcomes under specified
conditions. If the observed outcome does not differ significantly from the predicted
outcomes, the default hypothesis is upheld but not proved. If the prediction
differs significantly from the observed result, the hypothesis must be
rejected. Typically, it is modified to accord better with the existing evidence,
and then its modified predictions are empirically tested with new data. If it
survives numerous tests, it conventionally becomes a “fact.” In this sense, for
example, it is a “fact” that the earth revolves around the sun, and it is a “fact”
that all present-day organisms have evolved from primitive forms.
meh, mediocre or bad filosofy of science.
the problem with this data is that the women were not don having children. the data is from women aged 34. since especially smart women (and so mor whites) hav children later than that age, their fertility estimates ar spuriusly low. see also the data in Intelligence: A Unifying Construct for the Social Sciences (Richard Lynn and Tatu Vanhanen, 2012).
Whites perform significantly better than blacks on the subtests called Comprehension,
Block Design, Object Assembly, and Mazes. The latter three tests
are loaded on the spatial visualization factor of the WISC-R. Blacks perform
significantly better than whites on Arithmetic and Digit Span. Both of these tests
are loaded on the short-term memory factor of the WISC-R. (As the test of
arithmetic reasoning is given orally, the subject must remember the key elements
of the problem long enough to solve it.) It is noteworthy that Vocabulary is the
one test that shows zero W-B difference when g is removed. Along with Information
and Similarities, which even show a slight (but nonsignificant) advantage
for blacks, these are the subtests most often claimed to be culturally biased
against blacks. The same profile differences on the WISC-R were found in
another study[81b] based on 270 whites and 270 blacks who were perfectly
matched on Full Scale IQ.
seems inconsistent with typical environment only theories.