Celebrating an early female pioneer: Barbara Stoddard Burks

Because I hate seeing solid contributions go unnoticed.

Few remember Barbara Stoddard Burks for the early pioneer of behavioral genetics she was. She doesn’t even have a Wikipedia article. I will fix that shortly.

In their article on the sociologist’s fallacy in interpretation of familial data, Rowe and Rodgers (1997) mention Burks thus:

The issues raised here are not new [interpretations of ‘environmental’ variables]; the methodological consequences of ignoring genetic effects have been recognized for many decades. For example, Burks, in her pioneering studies of foster children, noted that some part of the family environment–child IQ association could arise from their shared association with parental and child heredity (Burks, 1928; 1938). Indeed, Burks’ use of the method of path analysis, which had been invented not long before by the geneticist Sewall Wright (1923), was decades ahead of its time. Later in this article, Fig. 1 is a conceptual descendant of Burks’ (1938) path model. Her conclusion, that 75–80% of IQ variance was due to innate and heritable causes and that family environmental effects were weak, did not lead social scientists, then or now, to routinely adopt behavior genetic strategies in the evaluation of family environmental effects.

Burks’ writings on behavioral genetics include:

On the relative contributions of nature and nurture to average group differences (1938)

Burks notes:

Far less satisfactory [than the study of individual differences] is the present position of the problem of average socio-economic group differences in intelligence — in fact it is only in the last few years that it has been generally recognized and discussed as a separate problem requiring its own techniques for solution. There is no simple correspondence between the contributions of nature and nurture to group and individual differences, but the same types of data are crucial for both problems. The writer has drawn upon two sources of data applicable to the problem of average group differences in IQ with respect to father’s occupation-her previous study conducted at Stanford University of the IQ’s of foster children and “own” children in relation to parental intelligence and home background, and the more recent study by Leahy at University of Minnesota dealing with the same type of material.

The central figure is this:

Burks reasons that if we can assume no selective placement, then the variation in adoptees' IQ as a function of the adoptive father's occupation measures the total environmental effect, and this can be compared with the variation in IQ as a function of occupation for the non-adoptees. The variation for non-adoptees is G+E, but that for adoptees is just E. Thus, the ratio of the two can be used to estimate G and E. She summarizes the variation in two different ways, does this for both samples, and even carries out a meta-analysis using a sampling variance-based metric as weights. She finds that:

The two approaches yield very similar results. Both methods agree that the IQ variation among children across parental occupational groups is ~75% heritable.
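The ratio logic can be sketched in a few lines of Python. This is only an illustration under stated assumptions: no selective placement, "variation" summarized as the variance of the occupational-group mean IQs, and group means that are invented for the example (they are not Burks' or Leahy's data). The function name `environmental_share` is hypothetical.

```python
from statistics import pvariance

def environmental_share(adoptee_group_means, own_child_group_means):
    """Between-group variance for adoptees (E only) divided by
    between-group variance for own children (G + E)."""
    var_e = pvariance(adoptee_group_means)
    var_ge = pvariance(own_child_group_means)
    return var_e / var_ge

# Hypothetical mean IQs by father's occupational class (illustrative only).
own_children = [90.0, 100.0, 110.0]  # spread reflects G + E
adoptees = [95.0, 100.0, 105.0]      # spread reflects E only

e_share = environmental_share(adoptees, own_children)
g_share = 1.0 - e_share
print(f"E share: {e_share:.2f}, G share: {g_share:.2f}")
# prints "E share: 0.25, G share: 0.75"
```

Note that the answer depends on how the variation is summarized: with the same made-up numbers, a ratio of standard deviations or of ranges would give 0.50 rather than 0.25, which is presumably one reason to try more than one summary.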

Statistical hazards in nature-nurture investigations (1928)

This is a surprisingly refreshing and modern chapter that touches upon a wide variety of fallacies and problems in the practical use of statistics. Some of them now have common names that they lacked in 1928, so I have anachronistically used the modern name wherever one exists. Below I summarize some of the sections, but readers will have to read the chapter for themselves. I recommend it!

1. Unrepresentative sampling due to selection effects such as survivor bias.

Her example:

It is found, for example, in attempting to determine how much native difference exists between the mental levels of various races, that in the higher school grades negro children are closer to the level of white children than is the case in the lower grades. On the face of it, this might appear to mean that schooling had wiped out the early difference between negro and white children. If the white and negro children in the higher grades were typical of children of their age, this would indeed be the case. But if it turns out that only the ablest negroes continue at school, it may be that nurture has had no effect at all in narrowing the gap between the abilities of the chosen samples of the two races.

2. The impossibility of drawing causal conclusions about nature-nurture from ordinary family data

There are many types of study in which this hazard is inherent. Under some conditions the only way to obviate the difficulties is to find a new approach to the problem that will extricate the ‘causes’ by experimental means.

Let us consider for a moment the correlations of .40 to .50 between the intelligence of siblings or between that of parents and offspring which have been reported by many different investigators. To what are these due? “To nature, to similar germ plasm,” answers the hereditarian. “To nurture, to the molding influences of home training and similar educational opportunities,” answers the environmentalist.

Either answer is consistent with the observed facts, yet neither answer can be established through the facts immediately at hand. The hypothesis that family resemblance may be due to the combined forces of nature and nurture could also explain the observed facts. It therefore behooves us to defer interpretation until data from studies using different methods of attack untangle the real causes of family resemblance.

3. Causal interpretations of partial and multiple correlations; Sociologist’s fallacy

Here Burks gives perhaps the first path model-based explanation of the sociologist’s fallacy:

Next let us examine a less extreme situation. In an attempt to determine how much effect differences in environment have upon the development of children’s intelligence we might collect data from an unselected group of families giving the intelligence-test scores of a child and both his parents, and the family cultural status as measured on a specially devised scale. If the results checked with the trends indicated by various investigators, the correlations computed might not be far from the following:

Mental age of mid-parent and intelligence of child: .60

Mental age of mid-parent and cultural status: .77

Intelligence of child and cultural status: .48

Using these hypothetical figures, let us calculate the partial correlation between intelligence scores of mid-parent and child. In current terminology we “hold constant” the cultural status, or “eliminate the effect” of cultural status.

[equation that shows the partial correlation to be .42]

Should we then be justified in concluding that the real relationship between the intelligence of parents and their children, after similarities induced by similar cultural surroundings had been discounted, was measured by a correlation coefficient of only .42? Surely not, because we know that by the nature of partial correlation we have eliminated not only those portions of parents’ and children’s intelligence that may result from differences in cultural status, but those portions which contribute to cultural status as well. The situation is represented diagrammatically below.

Ip represents intelligence of mid-parent; Io, intelligence of child; S, cultural status; X, factors other than Ip or Io contributing to S; and Y, combinations of genes not showing in the measurement of Ip contributing to Io. It is readily seen that in applying the partial correlation formula to this situation we have “partialled out” too much, and that in any study of causation we are partialling out too much when we render constant factors which may in part or in whole be caused by either of the two factors whose true relationship is to be measured, or by still other unmeasured remote causes which also affect either of the two isolated factors.
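The partial correlation in this example can be recomputed from the three hypothetical correlations. The sketch below assumes the standard first-order partial correlation formula; the function and variable names are mine, not Burks'.

```python
from math import sqrt

def partial_correlation(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y, holding z constant."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz**2) * (1 - r_yz**2))

# Burks' hypothetical correlations from the passage above.
r_parent_child = 0.60   # mid-parent mental age and child intelligence
r_parent_status = 0.77  # mid-parent mental age and cultural status
r_child_status = 0.48   # child intelligence and cultural status

r_partial = partial_correlation(r_parent_child, r_parent_status, r_child_status)
print(round(r_partial, 2))
```

With these inputs the formula gives about .41, a shade under the .42 quoted above; the small gap may reflect rounding in the original. Either way, the point stands: the drop from .60 does not show that cultural status causes part of the parent-child resemblance, because S is itself partly caused by Ip.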

6. Restriction of range

7. Just because a test is called ‘X measure’ does not mean it measures just X; On interpreting test differences as language bias

Thus, for example, we find studies that purport to measure the effect of language handicap on verbal intelligence tests scores by comparing the mental ages of foreign children earned on verbal and on non-verbal intelligence tests. The mental ages of children of certain low-testing nationalities commonly turn out to be closer to the norms of American children when measured on non-verbal tests than when measured on verbal ones. But in as much as verbal and non-verbal test scores, even for American children, seldom correlate with one another higher than .6 or .7, it is obvious that, although both types of tests are called ‘intelligence’ tests, they each measure about as much not held in common as they measure of what is held in common. Hence, it is not legitimate to infer from such data alone that language handicap accounts for the low scores of the foreign children on verbal tests.

In modern language, we would say that subgroup differences in ability patterns are also a possible explanation for the verbal vs. non-verbal gap, such as that seen for Asians in the US, even when they are native speakers of English.
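Burks' remark that the two kinds of test measure "about as much not held in common" as in common can be checked with a quick calculation. The sketch assumes the usual reading of a squared correlation as the proportion of variance two measures share; `shared_variance` is a hypothetical helper name.

```python
def shared_variance(r):
    """Proportion of variance two measures share, given their correlation r."""
    return r ** 2

for r in (0.6, 0.7):
    shared = shared_variance(r)
    print(f"r = {r}: shared {shared:.2f}, not shared {1 - shared:.2f}")
# prints "r = 0.6: shared 0.36, not shared 0.64"
# prints "r = 0.7: shared 0.49, not shared 0.51"
```

At r = .7 the split is roughly half and half, so a gap between verbal and non-verbal scores has plenty of room to arise from something other than language handicap.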

10. The ecological fallacy

11. Giving change scores in raw scores without providing measures of dispersion

Few things are more exasperating to investigators who wish to make use of previous work in their field than to come upon studies which present increases in certain test scores in terms of points or of percents without indicating the significance of these increases in terms of group variability.

This problem is surprisingly common in old studies. The results are usually uninterpretable unless one can find archaic test manuals that report the dispersions. Unfortunately, the problem has carried over into later literature, because Shuey's (1966) review included a lot of these nearly uninterpretable studies.

12. Statistical certainty and designs with non-independent units

She mentions the example of sibling comparisons, where a family with 3 siblings gives rise to 3 comparisons (AB, AC, BC; 1 per child), while a family with 2 children contributes only 1 comparison (AB; 0.5 per child). Clearly, one cannot simply give every comparison unit weight, because each child in the 3-sibling family would then count twice as much as each child in the 2-sibling family.
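The counting argument generalizes to any family size, as the short sketch below shows. The reweighting in `pair_weight` is one possible remedy that I am assuming for illustration (each pair weighted by 1/(n-1), so that every child carries the same total weight); it is not necessarily the correction Burks proposed, and both function names are hypothetical.

```python
from itertools import combinations

def pairs_per_child(n_siblings):
    """Return (number of sibling pairs, pairs per child) for one family."""
    n_pairs = len(list(combinations(range(n_siblings), 2)))
    return n_pairs, n_pairs / n_siblings

def pair_weight(n_siblings):
    """Assumed remedy: weight each pair by 1 / (n - 1)."""
    return 1 / (n_siblings - 1)

for n in (2, 3, 4):
    n_pairs, per_child = pairs_per_child(n)
    weighted_per_child = n_pairs * pair_weight(n) / n
    print(n, n_pairs, per_child, weighted_per_child)
# prints "2 1 0.5 0.5"
# prints "3 3 1.0 0.5"
# prints "4 6 1.5 0.5"
```

Under unit weights the per-child contribution grows with family size (0.5, 1.0, 1.5), whereas the assumed reweighting holds it constant at 0.5.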

13. The generally untested assumption of cumulative environmental effects. [This one is still common.]

A pageful of citations could be presented in which some statement such as this is made: “It is fair to assume that the longer environment acts, the greater is its effect.” By the use of this basic assumption, elaborate ‘proofs’ are sometimes built up to show that environment can or cannot account for such and such observed facts. The assumption may or may not be true; again, it may be true under some conditions, false under others; it is far from axiomatic. Thus, in some situations it would seem at least as reasonable to postulate that environment quickly accomplishes its maximal effect, and if constant thereafter, is powerless further to add or detract.

14. A simple polygenic model for the inheritance of intelligence compared with Mendelian models that were popular at the time.

She shows that this model fits the data much better than the then-popular Mendelian models. In fact, she uses a kind of cutoff approach that is now commonly used in modeling mental illness (the liability model).


Barbara died suddenly in 1943 (age 40), but she had already amassed a publication record of 59 papers and book chapters. A list of publications is given by Lewis Terman in a long, glowing obituary:

The death of Barbara Stoddard Burks on May 25, 1943, at the early age of 40 years, was a truly serious loss not only to psychology but also to biology, sociology, and education. Her record for creative productivity, which has rarely been equalled by one of her years either in quantity or quality, was made possible by an extraordinary combination of intellect, energy, and scientific enthusiasm. As Dr. Florence Goodenough has expressed it in a personal letter to me, “In the short span of her life Dr. Burks’ contributions would have done credit to one of double her age. Her zeal in research, her fine technical skill, and her clear insight into the basic principles underlying the problems which she set out to solve won the unqualified admiration of her colleagues both in this country and abroad.”

  • Terman, L. (1943). BARBARA STODDARD BURKS 1902-1943. Eugenical News, 28, 3-5.
  • Murphy, G. & Cook, R. (1943). Barbara Stoddard Burks: 1902-1943. The American Journal of Psychology, 56(4), 610-612.

I have been unable to get a copy of the second obituary.

PS. See also another, arguably worse variant of contributions going unnoticed: Stigler’s law of eponymy.