Clear Language, Clear Mind

October 25, 2018

Terman’s belief in high heritability of intelligence: unwarranted confidence?

Russell Warne has a new piece out reviewing the life and criticism of Lewis Terman. While Warne generally defends Terman against criticism, he thinks that Terman went beyond his evidence regarding his strong belief in the heritability of traits. He writes:

For all the previously mentioned flaws, one criticism of Terman seems absent from the previous literature: Terman’s willingness to form a forceful opinion when the data were not strong enough to support this degree of confidence. For example, Terman stated, “All the available facts that science has to offer support the Galtonian theory that mental abilities are chiefly a matter of original endowment” (Terman, 1922e, p. 659). In response to sentiments like this, Minton (1988) stated, “Terman never provided unequivocal evidence that IQs reflected native ability. Based on his weddedness to Hall and Galton’s biological determinism, he simply assumed that IQs were genetically determined” (p. 200). Indeed, no psychologist at the time had data that could separate the influences of heredity and environment on intelligence test scores, so anyone who had a strong opinion about the topic—including Terman—lacked the empirical evidence to support their views. The first behavioral genetics studies that could estimate the impacts of genetics and environment would be published a few years later (Burks, 1928/1973; Hildreth, 1925; Tallman, 1928; Wingfield, 1928). In 1922, there were correlational data that supported Galton’s and Terman’s views, but these same data just as easily supported theories of purely environmental causes of interindividual differences for intelligence. Terman seems to have downplayed this possibility in the first half of his career when discussing the relative importance of nature and nurture (e.g., Terman, 1906, p. 372, 1922e, 1928a; Terman et al., 1915). He had a penchant for interpreting correlational data as being causal in nature—a tendency that extended beyond his research on intelligence (Hollingworth, 1939). [my bolding]

Warne is referring to the lack of modern-style twin and adoption studies at the time Terman was writing. However, the absence of these does not imply that evidence was generally unavailable; it was just more indirect. I can think of several lines of evidence:

  1. Family studies of physical features and various animal breeding studies show similar and strong relationships, and these are unlikely to be significantly environmental in origin. This increases the prior that the same will be found for other human traits. This argument was made by Galton in 1865 (Hereditary Character and Talent):

    As no experiment of this description has ever been made, I cannot appeal to its success. I can only say that the general resemblances in mental qualities between parents and offspring, in man and brute, are every whit as near as the resemblance of their physical features; and I must leave the existence of actual laws in the former case to be a matter of inference from the analogy of the latter.

  2. Evolution cannot select for traits that are not heritable, so if intelligence has evolved (as one could know at the time from the study of ancient crania), it must have been, or still be, heritable. From a variety of data one can infer that intelligence is not likely to have reached fixation at some maximum value, hence it must still be heritable.
  3. Animal breeders, e.g. of dogs, were known to be able to breed for any desired trait, including intelligence and other psychological traits. This is only possible if the traits are heritable, and by generalization the same is expected of humans.
  4. The main alternative hypothesis, that social class or parenting causes children’s intelligence, was already known at the time to be unsatisfactory on its own. Children of very rich people did not show extreme rates of giftedness as one would expect on a wealth model. Terman’s student Catherine Cox found (in 1926) that illustrious people were usually very gifted already in early childhood, before environmental effects could have had a major cumulative effect. Furthermore, children within a family were known to vary widely, which cannot plausibly be explained by parenting effects; genetics is the obvious hypothesis (though the variation could also be random).
  5. Galton had in fact carried out the first crude adoption study by investigating the relative eminence of popes’ biological children vs. adopted sons, often distant relatives. This argument is described in Hereditary Genius (1869):

    It is difficult to specify two large classes of men, with equal social advantages, in one of which they have high hereditary gifts, while in the other they have not. I must not compare the sons of eminent men with those of non–eminent, because much which I should ascribe to breed, others might ascribe to parental encouragement and example. Therefore, I will compare the sons of eminent men with the adopted sons of Popes and other dignitaries of the Roman Catholic Church. The practice of nepotism among ecclesiastics is universal. It consists in their giving those social helps to a nephew, or other more distant relative, that ordinary people give to their children. Now, I shall show abundantly in the course of this book, that the nephew of an eminent man has far less chance of becoming eminent than a son, and that a more remote kinsman has far less chance than a nephew. We may therefore make a very fair comparison, for the purposes of my argument, between the success of the sons of eminent men and that of the nephews or more distant relatives, who stand in the place of sons to the high unmarried ecclesiastics of the Romish Church. If social help is really of the highest importance, the nephews of the Popes will attain eminence as frequently, or nearly so, as the sons of other eminent men; otherwise, they will not.
    Are, then, the nephews, &c. of the Popes, on the whole, as highly distinguished as are the sons of other equally eminent men? I answer, decidedly not. There have been a few Popes who were offshoots of illustrious races, such as that of the Medici, but in the enormous majority of cases the Pope is the ablest member of his family. I do not profess to have worked up the kinships of the Italians with any especial care, but I have seen amply enough of them, to justify me in saying that the individuals whose advancement has been due to nepotism, are curiously undistinguished. The very common combination of an able son and an eminent parent, is not matched, in the case of high Romish ecclesiastics, by an eminent nephew and an eminent uncle. The social helps are the same, but hereditary gifts are wanting in the latter case.

I submit that the above lines of argument were available to Terman in the early 1900s, as some of them were to Galton and Darwin in the second half of the 1800s, and a reasonable person would take these into account and adjust the prior for heritability of human intelligence upwards to a high value. In fact, his beliefs have been mostly vindicated by later studies, which is evidence that he either didn’t base his beliefs on air as thin as supposed, or that the common perceptions of psychology of that time were remarkably accurate. I’ll let the reader pick their favorite interpretation!

Between group heritability

Aside from the within-group heritability, Warne faults Terman for confidently saying that between-group gaps in the USA were also due to genetics. He quotes Terman as:

How much of this inferiority [in intelligence test scores of Hispanic groups] is due to the language handicap and to other environmental factors it is impossible to say, but the relatively good showing made by certain other immigrant groups similarly handicapped would suggest that the true causes lie deeper than environment. (Terman, 1926, p. 57)

The quote choice is curious considering the parts I highlighted. Warne also quotes another chunk:

. . . represent the level of intelligence which is very, very common among Spanish-Indian and Mexican families of the Southwest and also among negroes. Their dullness seems to be racial, or at least inherent in the family stocks from which they come. The fact that one meets this type with such extraordinary frequency among Indians, Mexicans, and negroes suggests quite forcibly that the whole question of racial differences in mental traits will have to be taken up anew and by experimental methods. The writer predicts that when this is done there will be discovered enormously significant racial differences in general intelligence, differences which cannot be wiped out by any scheme of mental culture. (Terman, 1916, pp. 91-92)

But it has similar problems. Predicting something is not the same as holding a confident view about it. Writing that something “suggests quite forcibly” is hardly being extremely confident in a belief.

February 13, 2017

Getting personality right

Filed under: Differential psychology/psychometrics. Posted by Emil O. W. Kirkegaard @ 02:51

This post covers some stuff already covered by others, but more briefly. Two studies of interest:

  1. Riemann, R., Angleitner, A., & Strelau, J. (1997). Genetic and Environmental Influences on Personality: A Study of Twins Reared Together Using the Self- and Peer Report NEO-FFI Scales. Journal of Personality, 65(3), 449–475.
  2. Connelly, B. S., & Ones, D. S. (2010). An other perspective on personality: Meta-analytic integration of observers’ accuracy and predictive validity. Psychological Bulletin, 136(6), 1092–1122.

Study 1 – the heritability of personality

Heritabilities for personality traits — usually OCEAN — are commonly given as around 50%. A typical citation for this is Bouchard’s 2004 review which produced this table:

[Table from Bouchard’s 2004 review: heritability estimates for personality traits]

The values range from 42% to 57%, the mean of which is exactly 50%. (There’s a big meta-analysis from 2015 finding a mean of 40% but divergent results from adoption (22%) vs. MZT-DZT (47%) designs.) These results come from standard twin studies: MZT-DZT comparisons. This design underestimates heritability when there is measurement error in the variables. Despite this, researchers routinely ignore measurement error and I have no idea why. As usual, Jensen got it right early on: in his 1969 review he adjusted the IQ findings for measurement error. So why don’t they follow his example?

Self-rating measures of personality suffer not just from regular, random measurement error but also from systematic measurement error (bias): people are not able to rate their own personality as well as other people who know them can. They introduce self-rating method variance into the data, and this variance is not so heritable. There is a twin study that used other-ratings of personality, and when the authors used them alone or combined them with self-ratings, the heritabilities went up:

[Three images of the personality heritability results: h pers 1, h pers 2, h pers 3]

So with self-report they found H 42-56%, mean = 51%; other-report: 57-81%, mean = 66%; combined: 66-79%, mean = 71%. (I used the AE models’ results when possible.) In fact, these analyses did not correct for regular measurement error either, so the heritabilities are higher still according to these data, likely into the 80s. This is the same territory as cognitive ability.

Main caveat: unreplicated study based on n = 964 cases. That sounds like a lot, but it is not for twin studies. Estimates of H rely on four measurements, so sampling error adds up quickly. (One has to estimate the intraclass correlations for MZs and DZs which are based on case pairs. Then one has to estimate the difference between these correlations.)
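To get a feel for how quickly the sampling error adds up, here is a rough sketch. It assumes the simple estimate H = 2(MZ-DZ), treats the twin correlations like ordinary correlations with the large-sample standard error (1 - r^2)/sqrt(n - 1), and treats the MZ and DZ samples as independent. The function name, the split into 500 MZ and 464 DZ pairs, and the correlations of .50 and .25 are made up for illustration; they are not taken from the study.

```python
import math

def falconer_with_se(r_mz, n_mz, r_dz, n_dz):
    """Return (h2, se) for h2 = 2 * (r_mz - r_dz).

    Uses the large-sample approximation SE(r) = (1 - r^2) / sqrt(n - 1)
    for each correlation and assumes the two samples are independent.
    """
    se_mz = (1 - r_mz ** 2) / math.sqrt(n_mz - 1)
    se_dz = (1 - r_dz ** 2) / math.sqrt(n_dz - 1)
    h2 = 2 * (r_mz - r_dz)
    # The difference of two independent estimates has variance se_mz^2 + se_dz^2,
    # and doubling the difference doubles its standard error.
    se_h2 = 2 * math.sqrt(se_mz ** 2 + se_dz ** 2)
    return h2, se_h2

# Hypothetical inputs: 500 MZ pairs correlating .50, 464 DZ pairs correlating .25.
h2, se = falconer_with_se(0.50, 500, 0.25, 464)
low, high = h2 - 1.96 * se, h2 + 1.96 * se
print(f"h2 = {h2:.2f}, SE = {se:.2f}, approx. 95% CI = {low:.2f} to {high:.2f}")
```

Even with close to a thousand pairs, the approximate confidence interval under these assumptions spans several tenths, which is why a single twin study of this size should be read with caution.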

Jayman pointed me towards a replication of this finding in another and larger sample.

  • Riemann, R., & Kandler, C. (2010). Construct validation using multitrait-multimethod-twin data: The case of a general factor of personality. European Journal of Personality, 24(3), 258–277.

They fit a number of models to their data with higher-order factors, the big two and/or the GFP. Unfortunately, they only report the behavioral genetics model parameters from the best fitting model, which turned out to be a 5 + 2 model with cross loadings. The heritabilities from this were: E 86%, O 92%, ES (-N) 59%, A 85%, C 81%, Plasticity 50% and Stability 40%. If we use just the OCEAN traits as we did before, the mean heritability is 81%, with ES being the obvious outlier some 20 percentage points below the others. The heritabilities of the big two were similar to the normal estimates for OCEAN for whatever reason. It’s not clear what the heritabilities of the OCEAN traits would be if one used just the 5 factor model.

Study 2 – validity of self- vs. other-reported personality

If we accept the higher heritability of other-rated personality and that the cause of that is measurement error and bias, then we would also expect the (predictive) validity of other-rated personality to be stronger. At least, unless we think self-rating bias has as strong validity as the personality traits themselves. As it happens, there is a large meta-analysis on this topic concluding exactly that. They present their results in 3 large tables, but I’ve rearranged them in a smaller table for convenience:

                             Outcome
Trait                Rater   Stranger impressions   Academic achievement   Job performance
Emotional stability  Other   0.41                   0.46                   0.37
Emotional stability  Self    0.20                   0.25                   0.12
Extraversion         Other   0.46                   0.52                   0.18
Extraversion         Self    0.37                   0.09                   0.12
Openness             Other   0.58                   0.29                   0.45
Openness             Self    0.42                   0.09                   0.05
Agreeableness        Other   0.34                   0.02                   0.31
Agreeableness        Self    0.26                   0.06                   0.13
Conscientiousness    Other   0.42                   0.69                   0.55
Conscientiousness    Self    0.27                   0.22                   0.23

[For academic achievement, I used the self-report value with the largest n. This had the effect of maximizing the correlations for self-report.]

These correlations are corrected for measurement error in both variables, so they should be quite comparable with regards to true correlations. The other-report correlations are systematically larger. It is easy to see if one plots them.


If we average validities within outcomes and calculate the other/self ratios, these are 1.5 (stranger impressions), 2.8 (academic achievement) and 2.9 (job performance), mean = 2.4. Other-report is much more valid.
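The ratio calculation can be reproduced with a few lines. This is just a sketch of the arithmetic, using the values from the table above; the variable and function names are my own choices for illustration.

```python
# Validities copied from the table, in trait order:
# emotional stability, extraversion, openness, agreeableness, conscientiousness.
validities = {
    "stranger impressions": {
        "other": [0.41, 0.46, 0.58, 0.34, 0.42],
        "self": [0.20, 0.37, 0.42, 0.26, 0.27],
    },
    "academic achievement": {
        "other": [0.46, 0.52, 0.29, 0.02, 0.69],
        "self": [0.25, 0.09, 0.09, 0.06, 0.22],
    },
    "job performance": {
        "other": [0.37, 0.18, 0.45, 0.31, 0.55],
        "self": [0.12, 0.12, 0.05, 0.13, 0.23],
    },
}

def mean(values):
    return sum(values) / len(values)

# Ratio of the mean other-report validity to the mean self-report validity.
ratios = {
    outcome: mean(raters["other"]) / mean(raters["self"])
    for outcome, raters in validities.items()
}
mean_ratio = mean(list(ratios.values()))

for outcome, ratio in ratios.items():
    print(f"{outcome}: other/self = {ratio:.1f}")
print(f"mean ratio = {mean_ratio:.1f}")
```

Note that this takes the ratio of the averaged validities within each outcome, not the average of trait-by-trait ratios; the latter would be dominated by the traits with near-zero self-report validity.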

October 17, 2016

Notes on Steve Hsu’s second interview w/ Daphne Martschenko

Back in January, Steve Hsu did an interview with Daphne Martschenko, who is a PhD candidate in education at Cambridge. She’s basically doing science journalism on behavioral genetics as far as I can tell. Now there’s a new interview up. I have some comments on it because Steve was a little too nice.

Bias from twin studies

A never ending topic of contention in the last 100 years or so has been the ability of standard twin studies (MZ-DZ reared together comparison) to correctly estimate causation parameters, in particular heritability, shared environment (whatever siblings have in common), and everything else (‘unshared environment’).

Assortative mating

Assortative mating – the tendency of mates to be similar in traits – is generally a thing, not just for humans. But for humans, it seems to be strongest for age (.77) and religiousness (.75), substantial for educational attainment, cognitive ability (.48), political preferences (~.35), criminality (~.40), and apparently weak for personality (r’s 0-.20), at least when we rely on self-reported, high-order (5-6 factors) scores. There’s even weak to moderate assortative mating for seemingly irrelevant physical features like ear lobes.

Assortative mating negatively biases heritability estimates because it makes DZ twins more genetically similar on the trait of interest. Why? Because if parents are similar on a trait, and this trait is heritable, the parents are actually slightly related (inbred) with regards to that trait (and in general). To the degree that this is true, the genetic similarity of the DZ twins will be above 50%. However, MZ twins cannot get more similar since they are already at 100% (minus a few somatic mutations). The usual method to estimate heritability from standard twin studies relies on the assumption that there is no assortative mating, because this means the DZs are 50% genetically alike while the MZs are 100%. This means that one can just double the difference between their correlations to estimate heritability: H=2(MZ-DZ). In the presence of assortative mating, one has to more than double this value because the difference in genetic relatedness is less than 50%. E.g. if the DZs are actually 60% related for the trait, the difference between their relatedness and the MZs’ relatedness is only 40 percentage points. We need to multiply 40% by 2.5 instead of 2 to get to 100%. Suppose that for a particular trait, MZs correlate at .85 and DZs at .60. Then the standard estimate of heritability would be 2(.85-.60)=50%. But if there is assortative mating as we assumed, we need to do 2.5(.85-.60)=62.5%. The bias in the standard estimate was about -20%.
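The multiplier can be written out explicitly. This is a sketch under the usual simplifying assumptions (additive genetic effects only, and a shared environment that is equally similar for MZ and DZ pairs). The symbol g, for the genetic relatedness of DZ twins on the trait, is notation introduced here for illustration.

```latex
\begin{align}
r_{MZ} &= H + C \\
r_{DZ} &= g\,H + C \\
r_{MZ} - r_{DZ} &= (1 - g)\,H \\
H &= \frac{r_{MZ} - r_{DZ}}{1 - g}
\end{align}
```

With g = .50 the multiplier 1/(1 - g) is 2, which gives the standard formula; with g = .60 it is 2.5, as in the example.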

For more details, see:

Measurement error downwards bias

Measurement error (noise in the measurement of traits) systematically depresses correlations between any two variables. This also applies to twin studies. Measurement error makes the MZ and DZ correlations smaller, which means that heritability estimates get lower too, and the everything else category grows. Let’s say we have the above correlations, but we know that our tests only have .90 reliability. What we need to do is correct the observed correlations first: .85/sqrt(.90*.90)=.944, and .60/sqrt(.90*.90)=.667. Then we plug the numbers back in: 2(.944-.667)=55.6% (using the unrounded correlations). But then, we also should take into account assortative mating: 2.5(.944-.667)=69.4%.

Thus, taking into account both assortative mating (possibly an unrealistic amount, not sure how to calculate this exactly) and realistic amounts of measurement error yields a heritability estimate that’s about 40% larger (50% vs. 69.4%).
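The two corrections can be combined in a small function. This is a minimal sketch of the arithmetic in this section, not a substitute for proper model fitting. The function names and the `dz_relatedness` parameter are my own, and the .60 relatedness and .90 reliability are the illustrative values used above. The code works with unrounded intermediate values, so the last digit can differ slightly from hand calculations done with rounded correlations.

```python
import math

def disattenuate(r, rel_x, rel_y):
    """Correct an observed correlation for unreliability in both measures."""
    return r / math.sqrt(rel_x * rel_y)

def twin_heritability(r_mz, r_dz, dz_relatedness=0.5, reliability=1.0):
    """Estimate heritability from MZ and DZ twin correlations.

    dz_relatedness: assumed genetic relatedness of DZ twins on the trait
                    (0.5 means no assortative mating).
    reliability:    assumed reliability of the measure, the same for both twins.
    """
    r_mz_true = disattenuate(r_mz, reliability, reliability)
    r_dz_true = disattenuate(r_dz, reliability, reliability)
    return (r_mz_true - r_dz_true) / (1 - dz_relatedness)

naive = twin_heritability(0.85, 0.60)
reliability_only = twin_heritability(0.85, 0.60, reliability=0.90)
both = twin_heritability(0.85, 0.60, dz_relatedness=0.60, reliability=0.90)

print(f"standard estimate:           {naive:.1%}")
print(f"corrected for reliability:   {reliability_only:.1%}")
print(f"plus assortative mating:     {both:.1%}")
```

Keeping the two corrections as separate parameters makes it easy to see how much each one contributes on its own.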

Special MZ environment possible upwards bias

The standard twin design assumes that the environments of MZs are no more similar than those of DZs (in the equations, both MZs’ and DZs’ C are correlated at 1.00). Common sense has it that this is probably not entirely true. MZs look and act extremely similar, and so it is possible that people, including their parents, treat them more similarly. This may cause an extra environmental effect that makes them more similar. This would cause upwards bias in the heritability estimates because the effect of the special MZ environment is confounded with the genetic effect. The size of the bias depends on how much more similar the MZ environment is compared to the DZ one, and on the effect size of this environment effect. If this sibling environment-type effect is weak to begin with, even a strong extra MZ environment would only weakly bias the estimates. (I think it is too complicated to give a realistic numeric example.) That’s the reason I said it’s a possible bias. The biases from assortative mating and measurement error are known, not merely possible.
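A realistic example is hard to give, but a deliberately simplified toy version still shows the direction and rough size of the bias. It assumes that the MZ correlation is the sum of heritability, shared environment and an extra MZ-only environmental similarity, and that the DZ correlation is half the heritability plus the shared environment. All parameter names and values below are hypothetical.

```python
def falconer_estimate(h2, c2, extra_mz_env=0.0):
    """Standard estimate 2 * (r_mz - r_dz) under a toy model.

    h2:           true heritability
    c2:           shared environment common to MZ and DZ pairs
    extra_mz_env: extra similarity that only MZ pairs get from being
                  treated more alike (zero if the equal environments
                  assumption holds)
    """
    r_mz = h2 + c2 + extra_mz_env
    r_dz = 0.5 * h2 + c2
    return 2 * (r_mz - r_dz)

# Hypothetical trait with true heritability .60 and shared environment .10.
unbiased = falconer_estimate(0.60, 0.10)
small_extra = falconer_estimate(0.60, 0.10, extra_mz_env=0.02)
large_extra = falconer_estimate(0.60, 0.10, extra_mz_env=0.05)

print(f"no special MZ environment: {unbiased:.2f}")
print(f"extra MZ similarity .02:   {small_extra:.2f}")
print(f"extra MZ similarity .05:   {large_extra:.2f}")
```

In this toy model the upwards bias is simply twice the extra MZ similarity, so a weak special MZ environment effect produces only a small bias.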

How can we find out about this possible bias? There are multiple options. One is:

In 1968, Scarr proposed a test of the EEA which examines the impact of phenotypic similarity in twins of perceived versus true zygosity. We apply this test for the EEA to five common psychiatric disorders (major depression, generalized anxiety disorder, phobia, bulimia, and alcoholism), as assessed by personal interview, in 1030 female-female twin pairs from the Virginia Twin Registry with known zygosity. We use a newly developed model-fitting approach which treats perceived zygosity as a form of specified familial environment. In 158 of the 1030 pairs (15.3%), one or both twins disagreed with the project-assigned zygosity. Model fitting provided no evidence for a significant influence of perceived zygosity on twin resemblance for any of the five disorders.

There’s a replication here, n=882. There are a bunch more studies too.

There’s another kind of study as well: sometimes twins get misclassified, including by their parents and themselves. Meaning, they think they are MZ, but they are really DZ, or the other way around. There’s been a number of such studies and there’s a 2013 review of them. No surprises.

What about womb effects? Twins may or may not share placentas and chorions. Pretty much no effect of these either.

Other family members?

Twins reared together are just convenient, but one can use pretty much any pair of family relations. Here’s a large study using siblings and half-siblings:

Twin studies have been criticized for upwardly biased estimates that might contribute to the missing heritability problem.

We identified, from the general Swedish population born 1960-1990, informative sibships containing a proband, one reared-together full- or half-sibling and a full-, step- or half-sibling with varying degrees of childhood cohabitation with the proband. Estimates of genetic, shared and individual specific environment for drug abuse (DA), alcohol use disorder (AUD) and criminal behavior (CB), assessed from medical, legal or pharmacy registries, were obtained using Mplus.

Aggregate estimates of additive genetic effects for DA, AUD and CB obtained separately in males and females varied from 0.46 to 0.73 and agreed with those obtained from monozygotic and dizygotic twins from the same population. Of 54 heritability estimates from individual classes of informative sibling trios (3 syndromes × 9 classes of trios × 2 sexes), heritability estimates from the siblings were lower, tied and higher than those from obtained from twins in 26, one and 27 comparisons, respectively. By contrast, of 54 shared environmental estimates, 33 were lower than those found in twins, one tied and 20 were higher.

With adequate information, human populations can provide many methods for estimating genetic and shared environmental effects. For the three externalizing syndromes examined, concerns that heritability estimates from twin studies are upwardly biased or were not generalizable to more typical kinds of siblings were not supported. Overestimation of heritability from twin studies is not a likely explanation for the missing heritability problem.

One can also use family members not reared together such as regular adoptions and the rare MZ adoptions. Generally these approaches also find similar results.

To be fair, here’s a large Icelandic study that used a variety of relationships that found lower heritability estimates (about 75% of standard) and higher shared environment estimates. Not sure why this is the case.

Still, most of the evidence fits with no particular bias from which kinds of family relationship are used. Measurement error, however, always biases heritability downwards.

Heritability of cognitive ability, and parental S

Daphne refers to the famous Turkheimer 2003 study. Since this is an interaction finding (interactions have low priors) and suited left-wingers, this finding spread like wildfire despite other equally large or larger studies finding no such effect (some found reverse effects too). Finally, 12 years later, someone did a meta-analysis, and we can now see that: 1) this finding is apparently only seen in US samples, and 2) it was wildly overestimated by the initial study. There’s some rather extreme citation bias there too. I wonder why.


Even if this interaction effect were large, it would be mostly pointless because shared environment effects go away with age for this trait.

Heritability of social success

Daphne talks about the heritability of social success, e.g. occupations, education and so on. While one can estimate the influence of cognitive ability indirectly, it’s easier to just estimate heritability directly, which was first done 40 years ago. The finding of heritable social success is more or less a given because: 1) if success is a function of psychological traits, and 2) psychological traits are moderately to highly heritable, then 3) success is pretty likely to be fairly heritable too. Herrnstein’s syllogism.

This field is in general not well researched because it lacks a unifying theory (which the S factor model provides, kinda sorta). However, here’s some findings:

Education: meta-analysis ~ 40% H, C = 36%, E=25% (yes, apparently, they sum to 101). 44% in Taubman 1967.

Income: 42% in NLSY. 25-54% in Finnish twins female/male. 48% in Taubman 1976.

Occupation: 43% among the 1944-1960 cohort.

‘Neighborhood deprivation’: 65% nation-wide Swedish dataset, 71% in Scotland.

Still missing is a more general S factor heritability study (this can be done using NLSY links). I will make this prediction: higher S loading → higher heritability. E.g. single year income has a poor S loading because it’s a bad indicator of S, and hence should have a lower heritability. Heritability of S comes from highly heritable traits that include cognitive ability, personality and interests.

Self-identified race/ethnicity, race and cognitive ability

The forbidden topic!

I am fortunate that these are my views because they are politically correct and garner me praise, speaking and writing invitations, and book adoptions at the same time those who disagree with me are demeaned, ostracized, and in some cases threatened with tenure revocation even though their science is as reasonable as mine. Don’t get me wrong, I think their positions are incorrect and I have relished aiming my pen at what I regard to be their leaps of logic and flawed analysis. But they deeply believe that I am wrong. The problem is that I can tell my side far more easily than they are permitted to tell theirs, through invitations to speak at meetings, to contribute chapters and articles, etc.. This offends my sense of fairness and cannot be good for science. I think Saletan would agree with me on this.

See also: Double standards.

As for genomic ancestry and self-identity, there are lots of studies. Here’s some results from the PING dataset:


(Working paper here)

If we relate ancestry to cognitive ability while controlling for SIRE (self-identified race/ethnicity, and all the cultural effects related to it), we get:


The method is not entirely satisfactory. Better would be if we also had skin brightness for the participants (PING does not seem to have this), so we could control for any effects of skin-based racism, however implausible. Better yet, we could get a large set of racially admixed siblings. These siblings vary slightly in their genomic ancestry in imperceptible ways (e.g. one is 55% African, another is 50%). A genetic model predicts this small variation to be linked to cognitive ability. This design is neat because by virtue of using siblings, it controls for family effects. The results of such a study would be pretty conclusive. It’s not at all impossible; a very similar study has already been done for height. All that is needed is some large (10k? Maybe Gwern should do a power analysis? :) ) sibling dataset with genomic data, cognitive ability and skin brightness.

Or we can exploit the fact that while in the total population, skin brightness and racial ancestry are moderately to strongly correlated (r ≈ .50), they aren’t so between siblings (I can’t find an empirical demonstration of this). However, if skin-based racism is really responsible for cognitive differences, then skin brightness differences between siblings should show correlations to differences in cognitive ability, educational attainment, income, and so on. Yet they don’t. And if we control for cognitive ability, skin brightness has positive correlations to education. The opposite of what colorism predicts. In general, colorism is not a good hypothesis. The race gaps are smaller for things like income, whereas these are the easiest things to discriminate on.

GWAS results/polygenic scores, by the way, replicate/work partly in non-European samples too, to a degree that depends on the relatedness according to an animal breeding study. This is because the causal tagging depends on LD patterns. SNPs are generally not causal (or so we think!), they are just close to the causal variants on the genome.

Isn’t it more than a little odd that a topic that so many people consider so important and which is so easy to get to the bottom of, attracts so little research attention? This debate is easy to settle: get access to medical datasets with the required variables.

June 16, 2016

Measurement error and behavioral genetics in criminology

I am watching Brian Boutwell’s (Twitter, RG) talk at a recent conference and this got me thinking.

What are we measuring?

As far as I know, there are typically two outcome variables used in criminological studies:

  1. Official records convictions.
  2. Self-reported criminal or anti-social behavior.

But exactly what trait are we trying to measure? It seems to me that we are (or I am!) really interested in measuring something like the tendency to break laws in ways that are harmful to other people. Harmful is here used in a broad sense. Stealing something may not always cause someone harm, but it does deprive them (usually) unfairly of their property. Stealing is not always wrong, but it is usually wrong. Let’s call the construct we want to measure harmful criminal behavior.

Measurement error: two types

Before going on, it is necessary to distinguish between the two types of measurement error in classical test theory:

  1. Random measurement error.
  2. Systematic measurement error.

Random measurement error is by definition error in measurement that is not correlated with anything else at all (sampling error aside). Conceptually we can think of it as adding random noise to our measurements. A simple, every-day example of this would be a study where we examine the relationship between height and GPA for ground/elementary school students. Suppose we obtain access to a school and we measure the height of all the students using a measuring tape. Then we obtain their GPAs from the school administration. Random measurement error here would be if we used dice to pick random numbers and added/subtracted these to each student’s height.

Systematic measurement error (also called bias) is different. Suppose we are measuring the ability of persons to sneak past a guard post because we want to recruit a team of James Bond-type super spies. We conduct the experiment by having people try to sneak past a guard post. Because we have a lot of people to test, our experiment is carried out all day beginning in the early morning and ending in the evening. Each individual has to try three times to sneak past the guard post and we measure their ability as the number of times they sneaked past (so 0-3 are possible scores). We assign their trials in order of their birthdays: people born early in the year take their trials in the early morning. Because it is easier to see when the sun is higher in the sky, the individuals who happen to be born later or very early in the year have an advantage: it is more difficult for the guards to spot them when it is darker. Someone who successfully sneaked past the guards three times in the evening is not necessarily at the same skill level as someone who sneaked past three times around noon. There is a systematic error in the measurement of sneaking ability related to the time of testing, and it is furthermore related to the person’s birthday.
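The difference between the two types of error can be illustrated with a small simulation. This is only a sketch with made-up numbers: the true correlation of .50 between skill and some outcome, the size of the errors, and all variable names are assumptions chosen for illustration.

```python
import math
import random

def pearson(x, y):
    """Plain Pearson correlation between two equally long lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

random.seed(1)
n = 20000

# True sneaking skill and an outcome that correlates about .50 with it.
skill = [random.gauss(0, 1) for _ in range(n)]
outcome = [0.5 * s + random.gauss(0, math.sqrt(0.75)) for s in skill]

# A nuisance variable, e.g. how dark it was at the time of testing.
darkness = [random.gauss(0, 1) for _ in range(n)]

# Random error: dice-like noise that is unrelated to anything else.
score_random = [s + random.gauss(0, 1) for s in skill]
# Systematic error: the measured score also goes up with darkness.
score_biased = [s + d for s, d in zip(skill, darkness)]

r_true = pearson(skill, outcome)
r_random = pearson(score_random, outcome)
r_random_dark = pearson(score_random, darkness)
r_biased_dark = pearson(score_biased, darkness)

print(f"true skill vs. outcome:          {r_true:.2f}")
print(f"noisy score vs. outcome:         {r_random:.2f}")
print(f"noisy score vs. darkness:        {r_random_dark:.2f}")
print(f"biased score vs. darkness:       {r_biased_dark:.2f}")
```

In this setup both kinds of error weaken the correlation with the outcome, but only the systematic error makes the measured score correlate with the nuisance variable.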

Problems with official records

Using official records as a measure of harmful criminal behavior has a big problem: they often include convictions for things that aren’t wrong (e.g. drug use or sex work). Ideally, we don’t care about convictions for things like smoking cannabis because in a sense, this isn’t a real crime: it’s just the government that is evil. In the same way, homosexual sex or even oral sex is not a crime anymore, and was not a real crime back when it was illegal (overview of US ‘sodomy’ laws). There is a moral dimension as to what one is trying to measure if one does not just want to go with the construct of ‘any criminal behavior that the present day state in this country happens to have criminalized’.

Furthermore, official records are based on court decisions (and pleas). Court decisions are in turn the result of the police taking up a case. If the police are biased — rightly or wrongly — in their decision about which cases to pursue, this will give rise to systematic measurement error.

Since the police do not have infinite resources, they will not pursue every case they know of. They probably won’t even pursue every case they know of that they think they can win in court. There is thus an inherent randomness in which cases they pursue, i.e. random measurement error.

Worse, which cases the police pursue may depend on irrelevant things like whether the police leadership has set a goal for the number of cases of a given type that must be pursued every year. This practice seems to be fairly common, and yet it results in serious distortions in the use of police resources. In Denmark, the police often have these goals about biking violations (say, biking on the sidewalk). The result is that in December (if the goal is set on a yearly basis), if they are not close to meeting their goal, the police leadership will divert resources away from more important crimes, say, break-ins, to hand out fines to people breaking biking laws. They may also lower the bar as to what counts as a violation.

Even worse, they may focus on targeting violations that are not wrong but are easy to pursue. One police officer gave the following story (anonymously in order to prevent reprisals from the leaders!) in response to a parliament discussion of the topic:

“When we are told that we must write 120 bikers [hand out fines to] the next 14 days, then we don’t place ourselves in the pedestrian area while there are pedestrians, and when the bikers may cause problems. No, we take them in the morning when they bike thru the empty pedestrian area on their way to work, because then we get more quickly to the 120 number. In other words, we do it for the numbers’ sake and not for the sake of traffic safety.”

This kind of police behavior induces both random and systematic measurement error in the official records. For instance, people who happen to bike to work and whose work is on the opposite side of a pedestrian area are more likely to receive such fines.

Measurement error, self-rating and the heritability of personality traits

While personality is probably not really that simple to summarize, most research on personality uses some variant of the big five/OCEAN model (use this test). Using such measures, it has generally been found that the heritability of OCEAN traits is around 40%. Lots of room for environmental effects, surely. Unfortunately, most of the non-heritable variance is in the ‘everything else’ category.

But these results are based on self-rated personality and are not even corrected for random measurement error, which is usually easy enough to do. So, suppose we correct for random measurement error; then perhaps we get to 50% heritability. This is because (almost?) any kind of measurement error biases heritability downwards.
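As a rough worked example of such a correction: if random error only adds noise variance to the trait, the observed heritability is the true heritability multiplied by the reliability of the measure, so dividing by the reliability undoes the attenuation. The reliability of 0.80 below is an assumed value, picked because it turns 40% into the 50% mentioned above; it is not taken from any particular study.

```python
def disattenuate(h2_observed, reliability):
    """Correct an observed heritability for random measurement error.

    Assumes the error is purely random, so it inflates the total variance
    but leaves the genetic variance untouched.
    """
    if not 0 < reliability <= 1:
        raise ValueError("reliability must be in (0, 1]")
    return h2_observed / reliability

# Observed 40% heritability with an ASSUMED reliability of 0.80.
print(round(disattenuate(0.40, 0.80), 2))
```

A less reliable measure implies a larger correction, which is why self-report scales with modest reliabilities can hide quite a bit of heritable variance.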

What about self-rating bias? Surely there are some personality traits that cause people to systematically rate themselves differently from how other people rate them, i.e. systematic measurement error. Even for height, a very simple trait, using self-reported height deflated heritability by about 4 percentage points compared with clinical measurement (from 91% to 87%), and clinical measurement is not free of random measurement error either. Furthermore, human height varies somewhat within a given day, a kind of systematic measurement error.

So, are other-ratings of personality better? There is a large meta-analysis showing that other-ratings are better. They have stronger correlations with actual criteria outcomes than self-ratings:

[Figures: validity of other-ratings by strangers; other-ratings and academic outcomes; other-ratings and work performance]

This suggests considerable systematic measurement error in the self-ratings. The counter-hypothesis: others’ ratings of one’s personality, while not actually more accurate than self-ratings, causally influence the chosen outcomes, such that it appears that other-ratings are better. E.g. teachers/supervisors give higher grades/performance ratings to those they incorrectly judge to be more open minded due to some kind of halo effect. I don’t know of any research on this question.

Still, what do we find if we analyze the heritability of personality using other-ratings and especially the combination of self- and other-ratings? We get this:


A mean heritability of 81% for the OCEAN traits. Like the height study, there was evidence of heritable influence on systematic self-rating error (53% in this study; the height study found 36% but had limited precision).

Conclusion: measurement error and criminology

Back to criminology. We have seen that:
  1. Official records have serious problems with measuring the right construct (criminal harmful behavior), probably suffer from lots of random measurement error and probably some systematic measurement error.
  2. Self-ratings suffer from systematic measurement error.
  3. Measurement error biases estimates of heritability downwards.
We combine them and derive the conclusion: heritabilities of harmful criminal behavior are probably seriously underestimated.
Questions for future research:
  • Locate or do behavioral genetic studies of crime based on multiple methods and other-ratings. What do they show?
  • Find evidence to determine whether the higher validity of other ratings is due to their higher precision or due to causal halo effects.

March 19, 2016

Is sharing the chorion important? (No)

Filed under: Genetics / behavioral genetics — Tags: , , — Emil O. W. Kirkegaard @ 07:33

In the comments on this SlateStarCodex post, someone is questioning my claim that sharing the chorion (what is that?) is unimportant:


Uterine environment stuff is not important. We know this because of this study

and because ordinary siblings are about as similar as DZ twins who did share the uterus at the same time. They have the same genetic relatedness of .50 by descent.



First, the paper that you linked doesn’t seem to be as strong as you claim. I quote from page 7: “Intra-uterine environmental factors do influence the intra-pair similarity of MCMZ and DCMZ pairs for birth weight, weight during the first years of life, achieving motor milestone standing alone, externalizing behaviors at age 3, internalizing behaviors and anxiety at age 12, and autistic behavior at age 7.” I’d be very reluctant to wave all of that away as unimportant.

The abstract claims that these results would only have “small” effects on heritability estimates, but the numbers brought up in the Discussion don’t sound that small to me: “For internalizing behavior at age 12 years, with a difference of 0.11, the heritability will be 50 % [2*(0.71–0.46)] when including both types of twins and 38 % [2*(0.65–0.46)] when including only DC twins. For anxiety at age 12, the heritability will be 70 % [2*(0.71–0.36)] when both types were included and 52 % [2*(0.62–0.36)] when only DCMCs were included.” Maybe these results are due to chance and as such these effects aren’t real, but we’d need to look at more studies to determine whether or not that is the case.

Regardless, even if we were to assume that research generally finds uterine environment to be unimportant, isn’t the (as far as I can tell, let me know if there is something that I missed) very well established Older Brother Effect – that is, having more biological older brothers makes a man more likely to be gay – a glaring exception to that? And if it is, doesn’t that raise the question of what other effects we might have missed?

The first point mistakes the detection of an effect for an important effect. A large study can find unimportant effects, and a study that looks for a lot of effects will find some with p<a (p-value less than alpha, aka “significant”) by chance. It is important to focus on effect sizes, not p<a.
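To put rough numbers on the "by chance" part, here is a small sketch. It assumes the tests are independent and that every null hypothesis is true; the figure of 50 tests is hypothetical and is not a count taken from the paper.

```python
def expected_false_positives(n_tests, alpha=0.05):
    """Expected number of p < alpha results when every null is true."""
    return n_tests * alpha

def prob_at_least_one(n_tests, alpha=0.05):
    """Chance of at least one p < alpha result among independent tests."""
    return 1 - (1 - alpha) ** n_tests

n_tests = 50  # hypothetical number of traits/ages tested
print(expected_false_positives(n_tests))
print(round(prob_at_least_one(n_tests), 3))
```

With that many tests, a handful of "significant" results is what a pure null model predicts, so the list of detected effects by itself says little.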

So, the proper strategy is to extract the effect sizes and analyze them. I did this using the provided supplementary materials. I note that the authors were silly since they not only provided an oddly formatted table but also provided it in .DOCX instead of a spreadsheet. Anyway, I extracted the data in R and analyzed it. The analysis is simply calculating MC – DC for each trait: the difference in twin similarity between pairs who shared the chorion and pairs who did not. A null + sampling error model predicts a distribution of effects around 0 with a few outlying values. What do we see?


The mean is 0.00, and the values are tightly clustered around 0. In fact, we can go further. A null + sampling error model also predicts that the larger effects in either direction should be the less precisely measured ones. So, we calculate the standard errors of the deltas and plot these against the absolute effect sizes.


We do see a strong relationship: the less precise estimates tend to be further away from 0. So, it really does seem that chorion effects are either nonexistent or very weak indeed. Of course, one could invoke the contextual defense: but it’s possible that sharing the chorion is important for some traits and unimportant for others, and you haven’t disproven that. This is the same defense usually adopted by authors whose pet hypothesized effects fail to replicate, so color me skeptical.
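For readers who want to redo this kind of check, here is one way the delta and its standard error could be computed from two twin correlations. This is a sketch under assumptions, not the code behind the plots: it uses the common large-sample approximation for the standard error of a correlation, treats the MC and DC groups as independent, and the correlations and pair counts are hypothetical.

```python
import math

def se_correlation(r, n):
    """Large-sample standard error of a Pearson correlation from n pairs."""
    return (1 - r ** 2) / math.sqrt(n - 1)

def delta_with_se(r_mc, n_mc, r_dc, n_dc):
    """Difference between two independent correlations and its standard error."""
    delta = r_mc - r_dc
    se = math.sqrt(se_correlation(r_mc, n_mc) ** 2 + se_correlation(r_dc, n_dc) ** 2)
    return delta, se

# Hypothetical values: the same small difference, measured in small and large samples.
for n_pairs in (100, 1000):
    delta, se = delta_with_se(0.70, n_pairs, 0.65, n_pairs)
    print(f"n = {n_pairs:4d} pairs: delta = {delta:.2f}, SE = {se:.3f}")
```

Under the null model, deltas with large standard errors are the ones expected to land far from 0, which is the pattern to look for when plotting absolute delta against standard error.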

Data and code



Later on, in the comments section of the linked post, Scott says this:

The womb is counted as shared environment, since both MZ twins *and* DZ twins share a womb. The only case in which womb is not counted as shared environment is insofar as MZ twins have a more similar uterine environment than DZ twins, which I think happens in certain cases involving a structure called the chorion that my medical school professors would be horrified I don’t remember anything about. But I think most intrauterine issues are in shared environment.

While twins may be sharing an environment in a literal sense, this is not what the term “shared environment” means. When siblings share the experience of a divorce in the family, this too is usually not going to add to the “shared” (C) component of the normal ACE model. The distinction between shared and non-shared or unique (E) environment is subtle. Events are shared only to the extent that they make family members more similar. E is the residual variance left over after partitioning A and C; only part of this component is stable.
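For reference, the A, C and E components can be approximated from twin correlations with Falconer's formulas, which is also the 2*(rMZ – rDZ) arithmetic in the commenter's quote above. This is only a back-of-the-envelope sketch, not the model fitting used in the literature; the example correlations are the ones quoted above for internalizing behavior at age 12 with both twin types included.

```python
def falconer_ace(r_mz, r_dz):
    """Rough ACE decomposition from MZ and DZ twin correlations.

    A: additive genetic, C: shared environment, E: non-shared environment
    (E also absorbs measurement error). Assumes additive effects only.
    """
    a = 2 * (r_mz - r_dz)
    c = 2 * r_dz - r_mz
    e = 1 - r_mz
    return a, c, e

a, c, e = falconer_ace(r_mz=0.71, r_dz=0.46)
print(f"A = {a:.2f}, C = {c:.2f}, E = {e:.2f}")
```

Note that C is whatever twin similarity is left over after the genetic part, so an event only counts as "shared" here if it raises the DZ correlation above half the MZ correlation.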

Scott may know this but he may still think that prenatal environments make twins more similar. In reality, the action of the chorion is almost never to increase twin resemblance, and therefore, it most often constitutes a non-shared effect. When Price (1950, 1978) reviewed this literature, he argued that nearly all prenatal effects caused MZ twins to be less similar. Subsequent research (recapitulated in Bouchard & McGue, 2002) reinforces this point. This fits with Wilson’s (1976, 1983, here; see also Loehlin, 2016) work on the concordance in growth parameters in MZ and DZ twins. The general pattern is such that MZ twins are less similar at birth than are DZ twins because the former more often share the same prenatal environment; despite this, the effects of the prenatal environment diminish and MZ twins become much more similar with age, a phenomenon dubbed the “Wilson effect” (Bouchard, 2013). Worth noting is that maternal influences may also be heritable (e.g., Lunde et al., 2007; Anum et al., 2009).

Nonetheless, some have argued that the shared experience of the prenatal environment explains a substantial part of MZ twin similarity. For IQ, Devlin, Daniels & Roeder (1997a) argued, based on models fit to samples of adoptees (primarily children), that prenatal influences explained twin similarity in excess of genetic effects. These authors estimated a broad-sense heritability of 47% (32% additive, 15% non-additive) and alleged that the maternal environment explained an additional 27% of the similarity for twins. This is similar to estimates from Loehlin (1989) and Chipuer, Rovine & Plomin (1990), the former estimating H² to be 47% or 58% depending on the model, and the latter estimating it at 51%.

It’s curious how Devlin et al. chalked up similarity to the prenatal environment when most of the published literature shows that prenatal effects are generally small (Munsinger, 1977; Jacobs et al., 2001; Loos et al., 2001; Marceau et al., 2016), usually work in the opposite direction, and seem to fade with age (while the heritability of IQ and many other traits increases; McCartney, Harris & Bernieri, 1990; McGue et al., 1993; Scarr, Weinberg & Waldman, 1993, cf. Loehlin, 2000, p. 185; Plomin et al., 1997; Polderman et al., 2006; Bergen, Gardner & Kendler, 2007; Segal et al., 2007; Haworth et al., 2010; Briley & Tucker-Drob, 2013; Bouchard, 2013; Tucker-Drob & Briley, 2014). Presumably, if the prenatal environment mattered much for IQ in adulthood, there would be some lasting effect of prenatal interventions, but the evidence here is scant (e.g., Dulal et al., 2018; here). It isn’t even clear that things like malnutrition and prenatal cocaine exposure have an effect on general intelligence (Metzen, 2012). The assumption undergirding this model – that the prenatal environment explains twin similarity – seems to have just been a modeling choice, not an empirical observation or finding.

The other major issue with model fitting exercises like Loehlin’s, Devlin’s, or Chipuer’s, is that they neglect the abovementioned effects of age on heritability. McGue et al. (1993 p. 62) first speculated that Chipuer’s results were due to this sort of neglect; Bouchard & McGue (2002) showed it. These authors showed that studies from young samples, in which shared environmental effects are larger, swamped the data, leading to underwhelming heritability estimates. When properly analyzed by age, the heritability increases (and in some samples, it also decreases a bit in the elderly), while the shared environmental component of IQ becomes statistically or practically insignificant by adulthood. The impotence of the shared environment in adulthood is not due to representativeness issues, as researchers like Nancy Segal (above) and Teasdale & Owens (1984) have shown. Bouchard & McGue (p. 17) write:

In summary, twin, adoption, and longitudinal family studies of IQ all converge on the conclusion that genetic factors increase while shared environmental factors decrease in importance with age, at least until middle age. Summary estimates of heritability from Devlin et al. (1997a), Chipuer et al. (1990), and Loehlin (1989) all fail to take these age effects into account.

There does not appear to be much or good evidence that heritability estimates derived from twins or alternative methods – such as censuses (Schwabe, Janss & van den Berg, 2017), twin family studies (van Leeuwen, van den Berg & Boomsma, 2008), sibling IBD analysis (Visscher et al., 2006), comparisons with other degrees of kinship (Paul, 1980; Bouchard & McGue, 1981), or GCTA (Hill et al., 2018) – are substantially inflated by the effect of the prenatal environment.

February 16, 2016

Comments on Gwern’s “Embryo selection for intelligence”


Embryo selection as an add-on to IVF [his summary]:

  1. harvest x eggs
  2. fertilize them and create x embryos
  3. culture the embryos to either cleavage (2-4 days) or blastocyst (5-6 days) stage; of them, y will still be alive & not grossly abnormal
  4. freeze the embryos
  5. optional: embryo selection using quality and PGS
  6. unfreeze & implant 1 embryo; if no embryos left, return to #1 or give up
  7. if no live birth, go to #6

Gwern asks the question: suppose the technology is ready for this, would the procedure be cost efficient given estimates about the value of cognitive ability? To estimate this is surprisingly difficult due to the lack of easily available information about the cost of IVF, freezing/thawing embryos and so on. Still, one can make some estimates. Gwern finds that currently it is probably not worth it in economic terms (purely in terms of cost benefit of economic gains).

GCTA/GREML results for cognitive ability

To estimate the likely IQ gains from ES, one must know the narrow heritability (h2n) and the predictive validity of genomic data for IQ. Estimates of h2n come from GCTA/GREML (GCTA is the software) papers. Gwern has done the first meta-analysis of such papers, as far as I know. Disturbingly, he finds that the papers using older subjects found lower, not higher h2. This is of course in contrast to results from familial studies (e.g. this paper).
Worse, from eyeballing his forest plot, it seems that the larger studies found smaller h2ns, indicating a small study effect. Often this means publication bias. Perhaps there are more GREML studies out there that found negligible h2n for cognitive ability but were not published.
Because Gwern shared his data and code, I quickly checked whether there was some evidence for publication bias. Here’s the forest plot sorted for effect size. (The plot is ugly because it uses base graphics. Someone did try to make a ggplot2 version, but it’s not too good yet.)
We can see that the less precise studies tend to find larger effects. The correlation between effect size and standard error is .35 [CI95: -.24 to .76]. There is some between-study variance (I² = 31%), so there are likely some moderators. In the moderator analysis, I included standard error, publication year, mean age, and twin status (did the study use twins or not?). Unfortunately, the output is given in natural units, not standardized units:
Mixed-Effects Model (k = 13; tau^2 estimator: REML)

tau^2 (estimated amount of residual heterogeneity):     0 (SE = 0.0026)
tau (square root of estimated tau^2 value):             0
I^2 (residual heterogeneity / unaccounted variability): 0.00%
H^2 (unaccounted variability / sampling variability):   1.00
R^2 (amount of heterogeneity accounted for):            100.00%

Test for Residual Heterogeneity: 
QE(df = 8) = 3.5513, p-val = 0.8952

Test of Moderators (coefficient(s) 2,3,4,5): 
QM(df = 4) = 10.4646, p-val = 0.0333

Model Results:
           estimate       se     zval    pval      ci.lb    ci.ub
intrcpt    -50.2260  53.6941  -0.9354  0.3496  -155.4644  55.0124    
SE           0.6149   0.8020   0.7666  0.4433    -0.9571   2.1868    
Age.mean    -0.0036   0.0013  -2.8209  0.0048    -0.0061  -0.0011  **
Twin TRUE   -0.1524   0.0893  -1.7060  0.0880    -0.3274   0.0227   .
pub_year     0.0252   0.0267   0.9432  0.3456    -0.0271   0.0774    

Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Of the attempted moderators, age was the most useful. The correlation between mean age and h2n is in fact -.54 [CI95: -.84 to .02], so maybe. I included the publication year to check for decline effects. Still, when age was in the model, there was not much evidence that standard error had any effect, i.e. no good evidence for publication bias. The funnel plot looks like this:


It seems slightly asymmetric, but it could be a fluke. We will have to wait for more studies to see.
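The wide confidence intervals quoted above for the two correlations (.35 and -.54) follow from having only 13 estimates. They can be approximately reproduced with the Fisher z transformation, as in this sketch. It assumes the intervals were computed that way; the second decimal may differ slightly because the input correlations are themselves rounded.

```python
import math
from statistics import NormalDist

def fisher_ci(r, n, level=0.95):
    """Approximate confidence interval for a correlation via Fisher's z."""
    z = math.atanh(r)                             # transform r to the z scale
    se = 1 / math.sqrt(n - 3)                     # standard error on the z scale
    crit = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    return math.tanh(z - crit * se), math.tanh(z + crit * se)

k = 13  # number of estimates in the datafile
for r in (0.35, -0.54):
    lo, hi = fisher_ci(r, k)
    print(f"r = {r:+.2f}  CI95: {lo:+.2f} to {hi:+.2f}")
```

With so few estimates, even a correlation of about .5 in absolute size has an interval that reaches zero, which is why "so maybe" is the right level of confidence.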

Genomic predictive validity in the near future for cognitive ability

Gwern digs up studies that report the R2 (variance explained) values for predicting case-level outcomes. Usually, studies fail in this regard as they only report the R2 value of the hits (that is, SNPs with p<alpha). Instead, they should report the R2 of the polygenic scores using all the SNPs (or some large fraction of them), because there is considerable validity in the SNPs whose p>alpha. Note, however, that one must be careful with overfitting, so preferably a validation sample should be used (or standard within-sample cross-validation methods).

Gwern finds that the values in the published studies are quite low, e.g. 2.5% in Rietveld et al (2013). I am more optimistic about the expected R2 values in the near future. This is because the standard GWASes do not use imputed data. However, using that drastically increases the h2n estimates. See the recent paper for height. I expect similar findings to happen for CA as well. As far as I know, there is nothing keeping researchers from using imputed data.

Also note that it is not R2 itself that matters in practice, but the R value (beta). To get half the predictive validity, one needs only a fourth of the variance explained: sqrt(.25) = .5.
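The same arithmetic can be applied to the published figure mentioned above. A minimal sketch, with a helper name of my own choosing, for a single predictor such as a polygenic score:

```python
import math

def r_from_r2(r2):
    """Predictive validity (correlation) implied by a variance-explained value."""
    return math.sqrt(r2)

print(r_from_r2(0.25))             # a fourth of the variance gives half the validity
print(round(r_from_r2(0.025), 3))  # the 2.5% reported for Rietveld et al (2013)
```

So a score explaining 2.5% of the variance already correlates about .16 with the outcome, which looks less hopeless than the R2 value suggests.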

Another problem is that standard GWASes use a poor method of finding hits. Genomic data is sparse (most predictors have betas of 0), so one should use sparse methods, e.g. lasso regression. Steve Hsu has written about it here. The reason lasso regression is not used currently is that it requires case-level data to be shared. Researchers don’t currently share case-level data, only summary data. Once again, science is held back by scientists’ (and funders’) unscientific behavior (data hiding).

Future prospects?

In general, tho, I agree with Gwern. Embryo selection for cognitive ability is currently not worth it. However, I expect (90%) that it will be in a few years (<5 years).

R code

The rewritten analysis code for meta-analysis of GCTA/GREML studies.

# libs --------------------------------------------------------------------
# p_load() comes from pacman; kirkegaard is the author's own package
library(pacman)
p_load(metafor, stringr, plyr, psych, kirkegaard)

# data --------------------------------------------------------------------
# expected columns: Study, HSNP (the h2n estimate), SE, Age.mean, Twin
d <- read.csv("data/GCTA_CA.csv")

# publication year: the first run of digits in the study label
d$pub_year <- as.numeric(str_extract(d$Study, "\\d+"))

# standardized copy of the data, leaving the outcome in natural units
d_std <- std_df(d, exclude = "HSNP")

# analyses ----------------------------------------------------------------
# sort by effect size for the forest plot; mean age breaks ties
d <- d[order(d$HSNP, d$Age.mean), ]

# random-effects model without moderators
rem <- rma(yi = HSNP, sei = SE, data = d); rem

# correlation of effect size with standard error and with mean age
print(corr.test(cbind(d$HSNP, d$SE)), short = FALSE)
print(corr.test(cbind(d$HSNP, d$Age.mean)), short = FALSE)

# moderator models
remAge  <- rma(yi = HSNP, sei = SE, mods = ~ Age.mean, data = d); remAge
remAgeT <- rma(yi = HSNP, sei = SE, mods = ~ Age.mean + Twin, data = d); remAgeT
remES   <- rma(yi = HSNP, sei = SE, mods = ~ SE, data = d); remES
remAll  <- rma(yi = HSNP, sei = SE, mods = ~ SE + Age.mean + Twin + pub_year, data = d); remAll

# plots
forest(rem, slab = d$Study)
# GG_forest(rem) + xlim(c(0, 1))
funnel(rem, slab = d$Study)

The datafile:

"Study","HSNP","SE","Age.mean","Twin"
"Deary et al 2012",0.48,0.18,11," FALSE"
"Deary et al 2012",0.28,0.18,71.3," FALSE"
"Plomin et al 2013",0.35,0.117,12," TRUE"
"Benyamin et al 2013",0.22,0.1,12," TRUE"
"Benyamin et al 2013",0.4,0.21,14," TRUE"
"Benyamin et al 2013",0.46,0.06,9," FALSE"
"Rietveld et al 2013",0.224,0.042,57.47," FALSE"
"Marioni et al 2014",0.29,0.05,57," FALSE"
"Kirkpatrick et al 2014",0.35,0.11,14.63," FALSE"
"Trzaskowski et al 2014",0.26,0.17,7," TRUE"
"Trzaskowski et al 2014",0.45,0.14,12," TRUE"
"Davies et al 2015",0.29,0.05,57.2," FALSE"
"Davies et al 2015",0.28,0.07,70," FALSE"

September 23, 2013

Review: Making sense of heritability

Filed under: Differential psychology/psychometrics,Education,Feminism/equality,Science,Sociology — Tags: , — Emil O. W. Kirkegaard @ 13:26



This is a GREAT book, which goes down to the basics about heritability and the various claims people have made against it. Highly recommended. Best book of the 29 i have read this year.


The denial of genetically based psychological differences is the kind of sophisticated error normally accessible only to persons having Ph.D. degrees.

David Lykken


Quote checks out.



I was introduced to the nature–nurture debate by reading Ned Block and Gerald Dworkin’s well-known and widely cited anthology about the IQ controversy (Block & Dworkin 1976a). This collection of articles has long been the main source of information about the heredity–environment problem for a great number of scientists, philosophers, and other academics. It is not an exaggeration to say that the book has been the major influence on thinking about this question for many years. Like most readers, I also left the book with a feeling that hereditarianism (the view that IQ differences among individuals or groups are in substantial part due to genetic differences) is facing insuperable objections that strike at its very core.

There was something very satisfying, especially to philosophers, about the way hereditarianism was criticized there. A strong emphasis was on conceptual and methodological difficulties, and the central arguments against hereditarianism appeared to have full destructive force independently of empirical data, which are, as we know, both difficult to evaluate and inherently unpredictable.

So this looked like a philosopher’s dream come true: a scientific issue with potentially dangerous political implications was defused not through an arduous exploration of the messy empirical material but by using a distinctly philosophical method of conceptual analysis and methodological criticism. It was especially gratifying that the undermined position was often associated with politically unacceptable views like racism, toleration of social injustice, etc. Besides, the defeat of that doctrine had a certain air of finality. It seemed to be the result of very general, a priori considerations, which, if correct, could not be reversed by “unpleasant” discoveries in the future.

But very soon I started having second thoughts about Block and Dworkin’s collection. The reasons are worth explaining in some detail I think, because the book is still having a considerable impact, especially on discussions in philosophy of science.

First, some of the arguments against hereditarianism presented there were just too successful. The refutations looked so utterly simple, elegant, and conclusive that it made me wonder whether competent scientists could have really defended a position that was so manifestly indefensible. Something was very odd about the whole situation.



There is indeed something about this. This book is a premier case of what Weinberg meant with his comment “…a knowledge of philosophy does not seem to be of use to physicists – always with the exception that the work of some philosophers helps us to avoid the errors of other philosophers.”





Of course, Bouchard would be justified in not worrying too much about these global methodological criticisms if the only people who made a fuss over them were philosophers of science. Even with this unfriendly stance becoming a consensus in philosophy of science, scientists might still remain unimpressed because many of them would probably be sympathetic to James Watson’s claim: “I do not like to suffer at all from what I call the German disease, an interest in philosophy” (Watson 1986: 19).

Source is: Watson, J. D. 1986, “Biology: A Necessarily Limitless Vista,” in S. Rose and L. Appignanesi (eds.), Science and Beyond, Oxford, Blackwell.



At this point I am afraid I may lose some of my scientific readers. Remembering Steven Weinberg’s statement that the insights of philosophers have occasionally benefited scientists, “but generally in a negative fashion – by protecting them from the preconceptions of other philosophers” (Weinberg 1993: 107), they might conclude that it is best just to avoid reading any philosophy (including this book), and that in this way they will neither contract preconceptions nor need protection from them. But the problem is that the preconceptions discussed here do not originate from a philosophical armchair. Scientists should be aware that to a great extent these preconceptions come from some of their own. Philosophers of science uncritically accepted these seductive but ultimately fallacious arguments from scientists, repackaged them a little, and then fed them back to the scientific community, which often took them very seriously. Bad science was mistaken for good philosophy.


Sesardic clearly saw the same connection to Weinberg’s comments as i did. :)



It may seem surprising that Jones dismissed the views of the founder of his own laboratory (Galton Laboratory, University College London) in such a manner. But then again this should perhaps not be so surprising. One can hardly be expected to study seriously the work of a man whom one happens to call publicly “Victorian racist swine” – the way Jones referred to Galton in an interview (Grove 1991). Also, in Jones’s book Genetics for Beginners (Jones & Van Loon 1993: 169), Galton is pictured in a Nazi uniform, with a swastika on his sleeve.


The virulent antinazism among these lefties is extraordinary. It targets everybody having the least to do with ideas the nazis also liked. It is a wonder no one attacks vegetarians or people who campaign against smoking for being nazis…



Arthur Jensen once said that “a heritability study may be regarded as a Geiger counter with which one scans the territory in order to find the spot one can most profitably begin to dig for ore” (Jensen 1972b: 243). That Jensen’s advice as to how to look upon heritability is merely an application of a standard general procedure in causal reasoning is confirmed by the following observation from an introduction to causal analysis: “the decomposition of statistical associations represents a first step. The results indicate which effects are important and which may be safely ignored, that is, where we ought to start digging in order to uncover the nature of the causal mechanisms producing association between our variables” (Hellevik 1984: 149). High heritability of a trait (in a given population) often signals that it may be worthwhile to dig further, in the sense that an important genetic mechanism controlling differences in this trait may thus be uncovered.


Another great Jensen insight.


Citation is to: 1972b, “Discussion,” in L. Ehrman, G. S. Omenn, E. Caspari (eds.), Genetics, Environment and Behavior, New York, Academic Press.



Second, even if a trait is shared by all organisms in a given population it can still be heritable – if we take a broader perspective, and compare that population with other populations. The critics of heritability are often confused, and switch from one perspective to another without noticing it. Consider the following “problem” for heritability:

the heritability of “walking on two legs” is zero. And yet walking on two legs is clearly a fundamental property of being human, and is one of the more obvious biological differences between humans and other great apes such as chimpanzees or gorillas. It obviously depends heavily on genes, despite having a heritability of zero. (Bateson 2001b: 565; cf. Bateson 2001a: 150–151; 2002: 2212)

When Bateson speaks about the differences between humans and other great apes, the heritability of walking on two legs in that population (consisting of humans, chimpanzees, and gorillas) is certainly not zero. On the other hand, within the human species itself the heritability may well be zero. So, if it is just made entirely clear which population is being discussed, no puzzling element remains. In the narrower population (humans), the question “Do genetic differences explain why some people walk on two legs and some don’t?” has a negative answer because there are no such genetic differences. In the broader population (humans, chimpanzees, and gorillas) the question “Do genetic differences explain why some organisms walk on two legs and some don’t?” has an affirmative answer. All this neatly accords with the logic of heritability, and creates no problem whatsoever. The critics of hereditarianism like to repeat that heritability is a population-relative statistic, but when they raise this kind of objection it seems that they themselves forget this important



Things like the number of fingers are also heritable within populations. There are rare genetic mutations that cause supernumerary body parts:


However, these are very rare, so to spot them, one needs a huge sample size. Surely the heritability of having 6 fingers is high, while the heritability of having 4 fingers is low, but not zero. Among people who have 4 fingers, most cases are probably caused by unique environment (i.e. accidents), but some are caused by genetics.



(4) It is often said that in individual cases it is meaningless to compare

the importance of interacting causes: “If an event is the result of the joint

operation of a number of causative chains and if these causes ‘interact’

in any generally accepted meaning of the word, it becomes conceptually

impossible to assign quantitative values to the causes of that individual

event” (Lewontin 1976a: 181). But this is in fact not true. Take, for example,

the rectangle with width 2 and length 1 (from Figure 2.3). Its area is 2,

which is considerably below the average area for all rectangles (around

100). Why is that particular rectangle smaller than most others? Is its

width or its length more responsible for that? Actually, this question is

not absurd at all. It has a straightforward and perfectly meaningful answer.

The rectangles with that width (2) have on average the area that is identical

to the mean area for all rectangles (100.66), so the explanation why the

area of that particular rectangle deviates so much from the mean value

cannot be in its width. It is its below-average length that is responsible.
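The rectangle argument can be checked with a few lines of Python. This is only a sketch: Figure 2.3 is not reproduced here, so the grid of widths and lengths below is a hypothetical one, chosen so that the overall mean area comes out close to the quoted 100.66. The names WIDTHS, LENGTHS and mean_area are made up for the illustration.

```python
from itertools import product
from statistics import mean

# Hypothetical population of rectangles (NOT the book's Figure 2.3):
# every width is combined with every length, so the two are uncorrelated.
WIDTHS = [1, 2, 3]
LENGTHS = [1, 50, 100]
rectangles = list(product(WIDTHS, LENGTHS))


def mean_area(rects):
    """Average area of a collection of (width, length) pairs."""
    return mean(w * l for w, l in rects)


overall = mean_area(rectangles)
# Mean area of all rectangles sharing the width of our rectangle (2).
with_width_2 = mean_area([r for r in rectangles if r[0] == 2])
# Mean area of all rectangles sharing the length of our rectangle (1).
with_length_1 = mean_area([r for r in rectangles if r[1] == 1])

print(f"overall mean area:        {overall:.2f}")
print(f"mean area given width 2:  {with_width_2:.2f}")
print(f"mean area given length 1: {with_length_1:.2f}")
```

In this toy population, fixing the width at 2 leaves the expected area at the overall mean, while fixing the length at 1 pulls it far below the mean. That is the sense in which the below-average length, and not the width, accounts for the small area of the 2 by 1 rectangle.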


Even the usually cautious David Lykken slips here by condemning

the measurement of causal influences in the individual case as inherently

absurd: “It is meaningless to ask whether Isaac Newton’s genius was due

more to his genes or his environment, as meaningless as asking whether

the area of a rectangle is due more to its length or its width” (Lykken

1998a: 24). Contrary to what he says, however, it makes perfect sense to

inquire whether Newton’s extraordinary contributions were more due to

his above-average inherited intellectual ability or to his being exposed

to an above-average stimulating intellectual environment (or to some

particular combination of the two). The Nuffield Council on Bioethics

makes a similar mistake in its report on genetics and human behavior:

“It is vital to understand that neither concept of heritability [broad or

narrow] allows us to conclude anything about the role of heredity in the

development of a characteristic in an individual” (Nuffield 2002: 40). On

the contrary, if the broad heritability of a trait is high, this does tell us

that any individual’s phenotypic divergence from the mean is probably

more caused by a non-standard genetic influence than by a non-typical

environment. For a characteristically clear explanation of why gauging

the contributions of heredity and environment is not meaningless even in

an individual case, see Sober 1994: 190–192.


This is a good point. The reason not to talk about the causes of a particular level of g in some person is not that the question is meaningless, but that it is difficult to know the answer. In some cases, though, it is clearly possible, cf. my number of fingers scenario above.



Sesardic mentions two studies finding that physical attractiveness is not correlated with intelligence. That goes against what i believe(d?). He cites:


Feingold, A. 1992, “Good-looking People Are Not What We Think,” Psychological Bulletin 111: 304–341.


Langlois, J. H., Kalakanis, L., Rubenstein, A. J., Larson, A., Hallam, M., and

Smoot, M. 2000, “Maxims or Myths of Beauty? A Meta-Analytic and Theo-

retical Review,” Psychological Bulletin 126: 390–423.


I apparently don’t have access to the first one, but i do have the second. In it one can read:


According to this maxim, there is no necessary correspondence

between external appearance and the behavior or personality of an

individual (Ammer, 1992). Two meta-analyses have examined the

relation between attractiveness and some behaviors and traits

(Feingold, 1992b; L. A. Jackson, Hunter, & Hodge, 1995).

Feingold (1992b) reported significant relations between attractiveness

and measures of mental health, social anxiety, popularity, and

sexual activity but nonsignificant relations between attractiveness

and sociability, internal locus of control, freedom from self-

absorption and manipulativeness, and sexual permissiveness in

adults. Feingold also found a nonsignificant relation between at-

tractiveness and intelligence (r = .04) for adults, whereas L. A.

Jackson et al. found a significant relation for both adults (d = .24

overall, d = .02 once selected studies were removed) and for

children (d = .41).


These meta-analyses suggest that there may be a relation

between behavior and attractiveness, but the inconsistencies in

results call for additional attention. Moreover, the vast majority of

dependent variables analyzed by Feingold (1992b) and L. A.

Jackson et al. (1995) assessed traits as defined by psychometric

tests (e.g., IQ) rather than behavior as defined by observations of

behaviors in actual interactions. Thus, to fully understand the

relations among appearance, behaviors, and traits, it is important to

broaden the conception of behavior beyond that used by Feingold

and L. A. Jackson et al. If beauty is only skin-deep, then a

comprehensive meta-analysis of the literature should find no sig-

nificant differences between attractive and unattractive people in

their behaviors, traits, or self-views.


So, maybe. It seems unlikely that g and pa (physical attractiveness) are NOT associated, if only as an effect of mating choices, since females prefer males with high SES and males prefer females with high pa. Then comes the mutational load hypothesis, and the fact that smarter people presumably are better at taking care of their bodies, which increases pa. I find it very difficult indeed to believe that they aren’t correlated.
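The quoted effect sizes are on two different scales (Feingold reports r, Jackson et al. report d), which makes them hard to compare by eye. A rough conversion, assuming two equally sized groups (an assumption of this sketch, not something stated in the quote), is r = d / sqrt(d^2 + 4). The function name d_to_r is made up for the illustration.

```python
from math import sqrt


def d_to_r(d):
    """Convert Cohen's d to a correlation r, assuming equal group sizes."""
    return d / sqrt(d * d + 4)


# d values as quoted from Langlois et al. (2000) for Jackson et al. (1995).
for label, d in [("adults, overall", 0.24),
                 ("adults, selected studies removed", 0.02),
                 ("children", 0.41)]:
    print(f"{label}: d = {d:.2f} -> r = {d_to_r(d):.3f}")
```

Under that assumption, the overall adult figure from Jackson et al. corresponds to an r of roughly .12 and the child figure to roughly .20, against Feingold's r = .04.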



In my opinion, this kind of deliberate misrepresentation in attacks on

hereditarianism is less frequent than sheer ignorance. But why is it that a

number of people who publicly attack “Jensenism” are so poorly informed

about Jensen’s real views? Given the magnitude of their distortions and

the ease with which these misinterpretations spread, one is alerted to

the possibility that at least some of these anti-hereditarians did not get

their information about hereditarianism first hand, from primary sources,

but only indirectly, from the texts of unsympathetic and sometimes quite

biased critics.8 In this connection, it is interesting to note that several

authors who strongly disagree with Jensen (Longino 1990; Bowler 1989;

Allen 1990; Billings et al. 1992; McInerney 1996; Beckwith 1993; Kassim

2002) refer to his classic paper from 1969 by citing the volume of the

Harvard Educational Review incorrectly as “33” (instead of “39”). What

makes this mis-citation noteworthy is that the very same mistake is to

be found in Gould’s Mismeasure of Man (in both editions). Now the

fact that Gould’s idiosyncratic lapsus calami gets repeated in the later

sources is either an extremely unlikely coincidence or else it reveals that

these authors’ references to Jensen’s paper actually originate from their

contact with Gould’s text, not Jensen’s.


Gotcha. A nice illustration of the trick map makers used to use to prove plagiarism.


Incidentally, in this case it ended up having another use! :)



Sesardic quotes:


In December 1986 our newly-born daughter was diagnosed to be suffering

from a genetically caused disease called Dystrophic Epidermolysis Bullosa

(EB). This is a disease in which the skin of the sufferer is lacking in certain

essential fibers. As a result, any contact with her skin caused large blisters

to form, which subsequently burst leaving raw open skin that healed only

slowly and left terrible scarring. As EB is a genetically caused disease it

is incurable and the form that our daughter suffered from usually causes

death within the first six months of life . . . Our daughter died after a painful

and short life at the age of only 12 weeks. (quoted in Glover 2001: 431 –

italics added)


from: Glover, J. 2001, “Future People, Disability, and Screening,” in J. Harris (ed.),

Bioethics, Oxford, Oxford University Press.


Nasty disease indeed. Only eugenics can avoid such atrocities.



On the contrary, empirical evidence suggests that for many important

psychological traits (particularly IQ), the environmental influences that

account for phenotypic variation among adults largely belong to the non-

shared variety. In particular, adoption studies of genetically unrelated

children raised in the same family show that for many traits the adult

phenotypic correlation among these children is very close to zero (Plomin

et al. 2001: 299–300). This very surprising but consistent result points

to the conclusion that we may have greatly overestimated the impact

of variation in shared environmental influences.6 The fact that variation

within a normal range does not have much effect was dramatized in the

following way by neuroscientist Steve Petersen:


At a minimum, development really wants to happen. It takes very impov-

erished environments to interfere with development because the biological

system has evolved so that the environment alone stimulates development.

What does this mean? Don’t raise your children in a closet, starve them, or

hit them in the head with a frying pan. (Quoted in Bruer 1999: 188)


But if social reforms are mainly directed at eliminating precisely these

between-family inequalities (economic, social, and educational), and if

these differences are not so consequential as we thought, then egalitar-

ianism will find a point of resistance not just in genes but also in the

non-heritable domain, i.e., in those uncontrollable and chaotically emerg-

ing environmental differences that by their very nature cannot be an easy

object for social manipulation.


All this shows that it is irresponsible to disregard constraints on mal-

leability and fan false hopes about what social or educational reforms can

do. As David Rowe said:


As social scientists, we should be wary of promising more than we are likely

to deliver. Physicists do not greet every new perpetual motion machine,

created by a basement inventor, with shouts of joy and claims of an endless

source of electrical or mechanical power; no, they know the laws of physics

would prevent it. (Rowe 1997: 154)


I will end this chapter with another qualification. Although heritability

puts constraints on malleability it is, strictly speaking, incorrect to say

that the heritable part of phenotypic variance cannot be decreased by

environmental manipulation. It is true that if heritability is, say, 80 percent

then at most 20 percent of the variation can be eliminated by equalizing

environments. But if we consider redistributing environments, without

necessarily equalizing them, a larger portion of variance than 20 percent

can be removed.


Table 5.5 gives an illustration how this might work.

In this example with just two genotypes and two environments (equally

distributed in the population), the main effect of the genotype on the

variation in the trait (say, IQ) is obviously stronger than the environmental

effect. Going from G2 to G1 increases IQ 20 points, while going from the

less favorable environment (E2) to the more favorable one (E1) leads

to an increase of only 10 points. Heritability is 80 percent, the genetic

variance being 100 and the environmental variance being 25. Now if we

expose everyone to the more favorable environment (E1) we will com-

pletely remove the environmental variance (25), and the variance in the

new population will be 100. The genetic variance survives environmental

manipulation unscathed.




But there is a way to make an incursion into the “genetic territory.”

Suppose we expose all those endowed with G1 to the less favorable

environment (E2) and those with G2 to the more favorable environment

(E1). In this way we would get rid of the highest and lowest score, and

we would be left only with scores of 95 and 105. In terms of variance, we

would have succeeded in eliminating 80 percent of variance by manipu-

lating environment, despite heritability being 80 percent.
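The numbers in this example can be verified with a short script. Table 5.5 itself is not reproduced here, so the four cell values below (115, 105, 95, 85) are a reconstruction inferred from the effects stated in the text: a 20-point genotype effect, a 10-point environment effect, and the surviving scores of 95 and 105. The names IQ, G_EFFECT, E_EFFECT and variance_components are made up for this sketch, which uses the usual decomposition Var(P) = Var(G) + Var(E) + 2 Cov(G, E).

```python
from statistics import pvariance

# Reconstructed cells (an assumption, inferred from the effects in the text).
IQ = {("G1", "E1"): 115, ("G1", "E2"): 105,
      ("G2", "E1"): 95, ("G2", "E2"): 85}
# Main effects as deviations from the grand mean of 100.
G_EFFECT = {"G1": 10, "G2": -10}
E_EFFECT = {"E1": 5, "E2": -5}


def variance_components(population):
    """Variance components for equally weighted (genotype, environment) pairs."""
    g = [G_EFFECT[gt] for gt, _ in population]
    e = [E_EFFECT[env] for _, env in population]
    n = len(population)
    mean_g, mean_e = sum(g) / n, sum(e) / n
    cov = sum((gi - mean_g) * (ei - mean_e) for gi, ei in zip(g, e)) / n
    return {
        "total": pvariance([IQ[p] for p in population]),
        "genetic": pvariance(g),
        "environmental": pvariance(e),
        "covariance": cov,
    }


baseline = variance_components(list(IQ))                       # all four cells
equalized = variance_components([("G1", "E1"), ("G2", "E1")])  # everyone in E1
redistributed = variance_components([("G1", "E2"), ("G2", "E1")])

heritability = baseline["genetic"] / baseline["total"]
removed = 1 - redistributed["total"] / baseline["total"]

print("baseline:     ", baseline)
print("equalized:    ", equalized)
print("redistributed:", redistributed)
print(f"heritability = {heritability:.0%}, variance removed = {removed:.0%}")
```

With these reconstructed cells, equalizing environments leaves a total variance of 100, while pairing G1 with E2 and G2 with E1 leaves only 25. The genetic and environmental components are unchanged in the second case; the drop comes entirely from the negative covariance term.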


How is this possible? The answer is in the formula for calculating vari-

ance in chapter 1 (see p. 21). One component of variance is genotype–

environment correlation, which can have a negative numerical value.

This is what has happened in our example. The phenotype-increasing

genotype was paired with the phenotype-decreasing environment, and

the phenotype-decreasing genotype was paired with the phenotype-

increasing environment. This move introduced the negative G–E corre-

lation and neutralized the main effects, bringing about a drastic drop in



The strategy calls to mind the famous Kurt Vonnegut story “Harrison

Bergeron,” where the society intervenes very early and suppresses the

mere expression of superior innate abilities by imposing artificial obsta-

cles on gifted individuals. Here is just one short passage from Vonnegut:


And George, while his intelligence was way above normal, had a little

mental-handicap radio in his ear – he was required by law to wear it at all

times. It was tuned to a government transmitter and, every twenty seconds

or so, the transmitter would send out some sharp noise to keep people like

George from taking unfair advantage of their brains. (Vonnegut 1970: 7)


We all get a chill from the nightmare world of “Harrison Bergeron.” But

in its milder forms the idea that if the less talented cannot be brought

up to the level of those better endowed, the latter should then be held

back in their development for the sake of equality, is not entirely with-

out adherents. In one of the most carefully argued sociological studies

on inequality there is an interesting proposal in that direction, about

how to reduce differences in cognitive abilities that are caused by genetic



A society committed to achieving full cognitive equality would, for example,

probably have to exclude genetically advantaged children from school. It

might also have to impose other handicaps on them, like denying them

access to books and television. Virtually no one thinks cognitive equality

worth such a price. Certainly we do not. But if our goal were simply to reduce

cognitive inequality to, say, half its present level, instead of eliminating it

entirely, the price might be much lower. (Jencks et al. 1972: 75–76 – emphasis



So although Jencks and his associates concede that excluding geneti-

cally advantaged children from school and denying them access to books

may be too drastic, they appear to think that the price of equality could

become acceptable if the goal was lowered and measures made more

moderate. Are they suggesting that George keeps the little mental-handicap

radio in his ear but that the noise volume should be set only at half



I wonder if someone could make a good video based on this… Oh that’s right…



David Lykken had a good comment on this tendency of some

Darwinians (he had John Tooby and Leda Cosmides in mind) to pub-

licly dissociate themselves from behavior genetics, in the hope that this

move would make their own research less vulnerable to political criti-

cisms: “Are these folks just being politic, just claiming only the minimum

they need to pursue their own agenda while leaving the behavior geneti-

cists to contend with the main armies of political correctness?” (Lykken



There are some obvious, and other less obvious, consequences of politically

inspired, vituperative attacks on a given hypothesis H. On the obvious

side, many scientists who believe that H is true will be reluctant to

say so, many will publicly condemn it in order to eliminate suspicion that

they might support it, anonymous polls of scientists’ opinions will give

a different picture from the most vocal and most frequent public pro-

nouncements (Snyderman & Rothman 1988), it will be difficult to get

funding for research on “sensitive” topics,19 the whole research area will

be avoided by many because one could not be sure to end up with the

“right” conclusion,20 texts insufficiently critical of “condemned” views

will not be accepted for publication,21 etc.


On the less obvious side, a nasty campaign against H could have the

unintended effect of strengthening H epistemically, and making the criti-

cism of H look less convincing. Simply, if you happen to believe that H is

true and if you also know that opponents of H will be strongly tempted

to “play dirty,” that they will be eager to seize upon your smallest mis-

take, blow it out of all proportion, and label you with Dennett’s “good

epithets,” with a number of personal attacks thrown in for good measure,

then if you still want to advocate H, you will surely take extreme care to

present your argument in the strongest possible form. In the inhospitable

environment for your views, you will be aware that any major error is a

liability that you can hardly afford, because it will more likely be regarded

as a reflection of your sinister political intentions than as a sign of your

fallibility. The last thing one wants in this situation is the disastrous

combination of being politically denounced (say, as a “racist”) and being proved

to be seriously wrong about science. Therefore, in the attempt to make

themselves as little vulnerable as possible to attacks they can expect from

their uncharitable and strident critics, those who defend H will tread very

cautiously and try to build a very solid case before committing themselves

publicly. As a result, the quality of their argument will tend to rise, if the

subject matter allows it.22


Interesting effects of the unpopularity of the views.



First of all, the issue about heritability is obviously a purely empirical

and factual one. So there is a strong case for denying that it can affect

our normative beliefs. But it is worth noting that the idea that a certain

heritability value could have political implications was not only criticized

for violating Hume’s law, but also for being politically dangerous. Bluntly,

if the high heritability of IQ differences between races really has racist

implications then it would seem that, after all, science could actually dis-

cover that racism is true.


The danger was clearly recognized by David Horowitz in his comments

on a statement on race that the Genetics Society of America (GSA)

wanted to issue in 1975. A committee preparing the statement took the

line that racism is best fought by demonstrating that racists’ belief in the

heritability of the black–white difference in IQ is disproved by science.

Horowitz objected:


The proposed statement is weak morally, for the following reason: Racists

assert that blacks are genetically inferior in I.Q. and therefore need not

be treated as equals. The proposed statement disputes the premise of the

assertion, but not the logic of the conclusion. It does not perceive that the

premise, while it may be mistaken, is not by itself racist: it is the conclusion

drawn (wrongly) from it that is racist. Even if the premise were correct, the

conclusion would not be justified … Yet the proposed statement directs its

main fire at the premise, and by so doing seems to accept the racist logic.

It places itself in a morally vulnerable position, for if, at some future time,

that the premise is correct, then the whole GSA case collapses, together

with its case for equal opportunity. (Quoted in Provine 1986: 880)


The same argument was made by others:


To rest the case for equal treatment of national or racial minorities on

the assertion that they do not differ from other men is implicitly to admit

that factual inequality would justify unequal treatment. (Hayek 1960:


But to fear research on genetic racial differences, or the possible existence

of a biological basis for differences in abilities, is, in a sense, to grant the

racist’s assumption: that if it should be established beyond reasonable doubt

that there are biological or genetically conditioned differences in mental

abilities among individuals or groups, then we are justified in oppressing

or exploiting those who are most limited in genetic endowment. This is, of

course, a complete non sequitur. (Jensen 1972a: 329)

If someone defends racial discrimination on the grounds of genetic differ-

ences between races, it is more prudent to attack the logic of his argument

than to accept the argument and deny any differences. The latter stance can

leave one in an extremely awkward position if such a difference is subse-

quently shown to exist. (Loehlin et al. 1975: 240)

But it is a dangerous mistake to premise the moral equality of human beings

on biological similarity because dissimilarity, once revealed, then becomes

an argument for moral inequality. (Edwards 2003: 801)


Good point indeed.

July 31, 2013

Review: The Nurture Assumption (Judith Harris)

Filed under: Psychology,Sociology — Tags: , — Emil O. W. Kirkegaard @ 10:03

This book turned out to be not what i had expected, but still interesting. Not sure why it got all the bad press. It’s behavior realistic but focuses on the environment which is what the author finds interesting. I think genetics is more interesting, but this is interesting too.
The Nurture Assumption: Why Children Turn Out the Way They Do, Revised and Updated, by Judith Rich Harris

Donald of the Apes

Donald was ten months old, and Gua seven and a half months, when she

came to live with the Kelloggs in 1931. Right from the start she was treated

like a human baby—that is, the way human babies were treated in the 1930s.

The Kelloggs put clothes on her, and the stiff shoes that babies wore in those

days. She wasn’t caged or tied up, which meant that she had to be watched

every second except when she was asleep (but then, the same was true of

Donald). She was potty trained. Her teeth were brushed. She was fed the same

foods as Donald and had the same naptimes and bathtimes. There is a photo­

graph in the Kelloggs’ book of Gua and Donald sitting side by side, dressed

identically in footed pajamas of the kind my mother used to call “Dr. Den-

tons.” Donald is frowning; Gua’s lips are curved upward in a modest smile.

They are holding hands.

Aside from the difference in temperament recorded in that revealing photo,

the two were remarkably well matched. Chimpanzees develop more rapidly in

infancy than humans, but Donald was two and a half months older and that

helped to even things up. They played together like siblings, chasing each

other around the furniture, roughhousing and giggling. Donald had a walker,

a big heavy thing, and one of his favorite sports, according to his parents, was

“to rush at the ape in this rumbling Juggernaut and laugh as she scurried to

keep from being run over, often without success.” But Gua didn’t hold grudges

and she enjoyed rough-and-tumble play. In fact, the two got along better

than most siblings. If one of them cried, the other would offer pats or hugs of

consolation. If Gua got up from her nap before Donald, she “could hardly be

kept from the door of his room.”1

Gua was more fun than a barrel full of Donalds. When the Kelloggs tickled

her or swung her around, she would laugh just like a human baby. If they tried

swinging Donald, he would cry. Gua was more affectionate (expressing her

affection with hugs and kisses) and more cooperative. While being dressed, the

ape—but not the boy—would push her arms into open sleeves and bend her

head to allow her bib to be tied on. If she did something wrong and was

scolded for it, she would utter plaintive “oo-oo” cries and throw herself into

the scolder’s arms, offering a “kiss of reconciliation” and uttering an audible

sigh of relief when she was permitted to bestow it.

In mastering the challenges of civilized life, Gua often caught on a little

faster than the stolid Donald. She was ahead in obeying spoken commands,

learning to feed herself with a spoon, and giving a warning signal when she

needed to use the potty (unfortunately, though, her potty training never

became completely reliable). The ape equaled or exceeded the child in most of

the tests that Dr. Kellogg devised: she was as adept as Donald at figuring out

how to use a hoe-shaped implement to pull a piece of apple toward her, and

learned more quickly to use a chair to reach a cookie suspended from the ceil­

ing. When the chair was moved to a new starting point, so that it had to be

pushed in a different direction to reach the cookie, Donald continued to

push it in the same direction as before, whereas Gua kept her eye on the

cookie and claimed the prize.2

There was one thing, however, in which the boy was clearly superior: Don­

ald was the better imitator. Does that surprise you? According to Frans de

Waal, a Dutch primatologist who spent several years observing the chim­

panzees and their human visitors at a Netherlands zoo, “Contrary to general

belief, humans imitate apes more than the reverse.”3

This was clearly the case with Donald and Gua. “It was Gua, in fact, who

was almost always the aggressor or leader in finding new toys to play with and

new methods of play; while the human was inclined to take up the role of the

imitator or follower.”4 Thus, Donald picked up Gua’s annoying habit of biting

the wall. He also picked up a fair amount of chimpanzee language—the food

bark, for instance. How did Luella Kellogg feel, I wonder, when her fourteen-

month-old son ran to her with an orange in his hands, grunting “uhuh, uhuh,


The average American child can produce more than fifty words at nineteen

months5 and is starting to put them together to form phrases. At nineteen

months, Donald could speak only three English words.* At this point the

experiment was terminated and Gua went back to the zoo.

The Kelloggs had tried to train an ape to be a human. Instead, it seemed

that Gua was training their son to be an ape. Their experiment tells us more

about human nature than about the nature of the chimpanzee, but it also tells

us that there is remarkably little difference between them—at least in the first

nineteen months. In this chapter I will look at some of the differences between

chimpanzee nature and human nature that appear after the age of nineteen

months, and at some of the similarities that remain.

One of the things that characterize these exceptional classrooms is the atti­

tude the students adopt toward the slower learners among them. Instead of

making fun of them, they cheer them on. There was a boy with reading prob­

lems in one of Rodriguez’s classes and when he started making progress the

whole class celebrated: “Every time he made a small step, the class would give

him a round of applause.”

[04:20:33] Emil – Deleet: using this effect was something Khan suggested

[04:20:37] Emil – Deleet: Khan from Khan Academy

[04:20:47] Emil – Deleet: to get the smarter kids to help the less smart

[04:20:57] Emil – Deleet: he suggested whole class achievements

[04:21:12] Emil – Deleet: so that the entire group benefits when everybody masters something

[04:21:20] Emil – Deleet: creating incentives for the smarter to help the others

[04:21:50] Emil – Deleet: teaching something also helps the teacher master it better, so both parties benefit

[04:21:51] Emil – Deleet: in theory

i would very much like to see experiments with this.

A well-dressed man often sports nothing more than a string around his waist to

which is tied the stretched-out foreskin of his penis. As a young boy matures, he

starts to act masculine by tying his penis to his waist string, and the Yanomamo

use this developmental phase to signify a boy’s age: “My son is now tying up his

penis.” A certain amount of teasing takes place at that age, since an inexperi­

enced youth will have trouble controlling his penis. It takes a while for the fore­

skin to stretch to the length required to keep it tied securely, and until then it is

likely to slip out of the string, much to the embarrassment of its owner and the

mirth of older boys and men.

In societies where education is compulsory, children rank “being left back

in school” as the third most scary thing they can think of, beaten out only by

“losing a parent” and “going blind.” “Wetting my pants in school” comes in

fourth.4 A Yanomamo boy with his penis not tied up is like an American child

who has wet his pants in school: he is a boy who has been left back. It would

be humiliating to walk around with a dangling penis when other boys his age

or younger were already tying theirs up. When the Yanomamo boy ties his

foreskin to the string around his waist, he’s not pretending to be his father: he’s

concerned about maintaining his status among the other children in the vil­

lage. It is the mirth of the older boys that provides the stick. It is the respect of

the younger ones that provides the carrot.

Then the mother, with the other women, accompanies her daughter into the

woods to adorn her. . . . One woman begins to rub a little red urucu over all her

body, which becomes pink. They then design wavy black lines, brown on her

face and body; they make lovely designs. When she is completely painted, they

push through the large hole in her ear those strips of young assai

leaves. . . . Then they take coloured feathers and push them through the holes

which they have at the corners of their mouths and in the middle of the lower

lip. One woman also prepares a long, thin, white stick, very smooth, which she

puts in the hole that they have between their nostrils. The young girl is really

lovely, painted and decorated like this! The women say: “Now let’s go.” The girl

walks ahead, and after her come the other women and the little girls.6

The parade wends slowly through the center of the village so that everyone

can admire the debutante. Though she is probably no more than fifteen years

old (menarche comes later to girls in tribal societies), she is now considered old

enough to marry. If her father has already promised her to someone she will

take up residence with her new husband. She went into the cage a girl and

came out a woman, as though a magician had waved his magic wand: Poof,

you’re a woman!


They are supposed to behave that way in some societies. Yanomamo men,

if they don’t like the way their wife is behaving, hit her with a stick or shoot an

arrow into some part of her anatomy they can do without. Ask Helena, the

Brazilian girl who was kidnapped by the Yanomamo. When Helena came of

age she was claimed by a Yanomamo headman, Fusiwe, who already had four

wives. Fusiwe was a nice guy by Yanomamo standards—reader, she loved

him!—but he got angry at her once for something that wasn’t her fault and he

broke her arm.

According to the editorial in the Journal of the American Medical Association,

Carl McElhinney was a child murderer. No, not a murderer of children, but a

seven-year-old boy who had committed a murder. The editorial was written in

1896; it was reprinted in JAMA a hundred years later as a historical curiosity.

I cannot give you any details of Carl’s crime because the focus of the edito­

rial was not on the murderer himself but on his mother.

Before Carl’s birth Mrs. McElhinney was an assiduous reader of novels. Morn­

ing, noon and night her mind was preoccupied with imaginative crimes of the

most bloody sort. Being a woman of fine and delicate perception, she appreci­

ated to an extent almost equaling reality the extravagant miseries, motive, vil­

lainies set down in novels, so that her mind was miserably contorted weeks

before the birth of her child Carl. The boy was an abnormal development of

criminality. He has a delight in the inhuman. It takes intense horror to please

this peculiar appetite. . . . I believe criminal record does not show a case so

remarkable as this. As the boy matures these mental conditions will mature. He

is dangerous to the community.

The cause of Carl’s abnormal development, according to the physician

who wrote the editorial, was the impression made on his mother’s mind by the

books she read while she was carrying him. Strong impressions on a woman’s

mind “may pervert or stop the growth, or cause defect in the child with which

she is pregnant.”

The editorial concluded, as editorials are wont to do, with a moral:

We as scientific physicians . . . should teach our patrons how to care for our

pregnant women, and the danger from maternal influences. The Spartans bred

warriors, and I believe this generation can breed a better people. One of the

future advances to help the generations to come, will be to teach them the

power of maternal influences, with better care of our pregnant women.1

The “better care of our pregnant women” would presumably include careful

screening of the reading material permitted to them.

Not so fast. It turns out that the ability of a criminal adoptive family to produce

a criminal child—given suitable material to work with—depends on

where the family happens to live. The increase in criminality among Danish

adoptees reared in criminal homes was found only for a minority of the subjects

in this study: those who grew up in or around Copenhagen. In small

towns and rural areas, an adoptee reared in a criminal home was no more

likely to become a criminal than one reared by honest adoptive parents.14

It wasn’t the criminal adoptive parents who made the biological son of

criminals into a criminal: it was the neighborhood in which they reared him.

Neighborhoods differ in rates of criminal behavior, and I would guess that

neighborhoods with high rates of criminal behavior are exceedingly hard to

find in rural areas of Denmark.

she is correct:

data here (danish):

The links between divorce, personality problems in the parents, and troublesome

behavior in the children are complex: the effects go every which way.

People with personality problems are difficult to live with so they’re more

likely to get divorced; the same people are more likely, for genetic reasons, to

have difficult kids. There might even be a child-to-parent effect: a difficult kid

can put a real strain on a marriage. Earlier in the chapter I quoted the joke

about Johnny, the kid who could break any home, but it is not funny if you

have a kid like Johnny. Some children make every member of the family wish

they could get out. Judith Wallerstein talks about the heavy load of guilt the

children of divorcing parents are burdened with—the kids think their parents’

divorce was their fault. What Wallerstein doesn’t consider is that sometimes

there may be an element of truth in what the kids think. Divorce occurs less

often in families that contain a son than in those that only have daughters.38

The presence of that boy either makes the parents happier or makes the father

more reluctant to walk out. But what if the boy is not a satisfying kid? What if

he is nothing but trouble?

didn't know that. although i'm not super surprised, since a lot of people have told me that they prefer to have male children. “easier to handle”, they say.

I see it in the news all the time; it always makes me angry. The Smith kid gets

into trouble and the judge threatens to throw his parents in jail. The Jones kid

burglarizes a house and his parents are fined for their failure to “exercise reasonable

control” over his activities. The Williams kid gets pregnant and her

parents are criticized for not keeping track of where she was and what she was

doing. One set of parents, when they found it impossible to keep their teenage

daughter out of trouble, chained her to the radiator. They were arrested for

child abuse.61

can't win…

Good things tend to go together. So do bad things. These are correlations.

Educational psychologist Howard Gardner would have us believe that there

are several different “intelligences” and that someone who was stinted on one

might have gotten a generous helping of another. But the fact is that people

who score low on tests of one kind of intelligence are also likely to score low

on tests of other kinds.68 We are pleased when we hear about a child who is

mentally retarded in most respects but who is a whiz at drawing or calculating:

it appeals to our sense of fairness. But such cases are uncommon. Far more

commonly, nature is unfair to mentally retarded children by giving them no

talents and making them physically clumsy as well. That is why they compete

in the Special Olympics instead of the regular Olympics.

“Everything is related to everything else,” said a psychologist whose specialty

was statistics. He told the story of a pair of researchers who collected

data on 57,000 high school students in Minnesota. The researchers asked the

kids about their leisure activities and educational plans, whether they liked

school, and how many siblings they had. They asked about their fathers’

occupation, their mothers’ and fathers’ education, and their families’ attitudes

toward college. There were fifteen items in all and 105 possible correlations

between pairs of items.* All 105 yielded significant correlations, most at

levels of significance that would be expected by chance less than .000001 of

the time.69

with the power of n=57k, sure, one can find even very small correlations!
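
To put rough numbers on that: a minimal sketch, assuming the Fisher z approximation for testing a Pearson correlation against zero. That is my choice of test, not necessarily what the original researchers used, and the function names are made up for illustration. It checks that fifteen items give 105 pairs, and asks how small a correlation would still pass the .000001 threshold at n = 57,000.

```python
import math
from statistics import NormalDist


def n_pairs(k: int) -> int:
    """Number of distinct pairs among k items: k choose 2."""
    return math.comb(k, 2)


def p_value(r: float, n: int) -> float:
    """Two-sided p-value for a correlation r at sample size n.

    Uses the Fisher z approximation (an assumption of this sketch):
    atanh(r) * sqrt(n - 3) is treated as standard normal under r = 0.
    """
    z = math.atanh(r) * math.sqrt(n - 3)
    return math.erfc(abs(z) / math.sqrt(2))


def min_detectable_r(n: int, alpha: float) -> float:
    """Smallest |r| that reaches two-sided significance level alpha at size n."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return math.tanh(z_crit / math.sqrt(n - 3))


if __name__ == "__main__":
    n = 57_000
    print("pairs among 15 items:", n_pairs(15))
    print("smallest r significant at 1e-6:", round(min_detectable_r(n, 1e-6), 4))
    for r in (0.01, 0.02, 0.05):
        print(f"r = {r}: p = {p_value(r, n):.3g}")
```

Under these assumptions a correlation of only about 0.02 already clears the .000001 bar at this sample size, so "significant" here says very little about how strong the relationships are.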

On the other hand, I don’t want to raise false hopes. So let me begin with a

true story, told by my late colleague David Lykken, about a pair of reared-apart

twins—one of the pairs studied at the University of Minnesota by the research

team of which he was a member.

They are identical twins separated in infancy; they grew up in different

adoptive homes. One became a concert pianist, talented enough to perform as

a soloist with the Minnesota Orchestra. The other cannot play a note.

Since these women have the same genes, the disparity must be due to a difference

in their environments. Sure enough, one of the adoptive mothers was

a music teacher who gave piano lessons in her home. The parents who adopted

the other twin were not musical at all.

Only it was the nonmusical parents who produced the concert pianist and

the piano teacher whose daughter cannot play a note.1

Not that being rejected by one’s peers is the end of the world. It hurts like

hell while it’s happening and it does leave permanent scars, but it doesn’t

keep a kid from being socialized (you can identify even with a group that

rejects you), and I’ve noticed that many interesting people went through a

period of rejection during childhood. Or got moved around a lot, which has

similar effects. I was moved around a lot as a child and went through four

years of rejection, and there is no doubt that I would have been a different person

if it hadn’t happened. A more sociable person, but more superficial. Certainly

not a writer of books—a job that has as its first requirement the

willingness to spend a good deal of time alone. The biologist and author E. O.

Wilson recalls his childhood this way:

I was an only child whose family moved around quite a bit in southern Alabama

and northwestern Florida. I attended 14 different schools in 11 years. So it was,

perhaps, inevitable that I grew up as something of a solitary and found nature

my most reliable companion. In the beginning, nature provided adventure;

later, it was the source of much deeper emotional and aesthetic pleasure.17

I attended 4 different primary schools, so I fit the pattern as well.
