
Genius: The Natural History of Creativity (Hans Eysenck, 1995)

I continue my Eysenck readings with his popular genius book (prior review: The Psychology of Politics (1954)). Having previously read some of Simonton’s work, I find Eysenck a very different beast! The writing follows his usual style: candid, upfront about uncertainty where it exists, funny, and very wide-ranging. In fact, regarding replication, Eysenck is almost modern, always asking for replications of experiments, and saying that it is a waste of time to do studies with n < 100!

I don’t have time to write a big review, but I have marked a bunch of interesting passages, and I will quote them here. Before doing so, however, the reader should know that there is now a memorial site for Hans Eysenck too, with free copies of his work. It’s not complete yet; his bibliography is massive! I host it, but it was created by a third party.

Let’s begin. Actually, I forgot to note interesting passages in the first half of the book, so these are all from the second part. Eysenck discusses the role of the environment in the origins of genius, and illustrates with an unlikely case:

Our hero was born during the American Civil War, son of Mary, a Negro slave on a Missouri farm owned by Moses Carver and his wife Susan. Mary, who was a widow, had two other children – Melissa, a young girl, and a boy, Jim; George was the baby. In 1862 masked bandits who terrorized the countryside and stole livestock and slaves attacked Carver’s farm, tortured him and tried to make him tell where his slaves were hidden; he refused to tell. After a few weeks they came back, and this time Mary did not have time to hide in a cave, as she had done the first time; the raiders dragged her, Melissa and George away into the bitter cold winter’s night. Moses Carver had them followed, but only George was brought back; the raiders had given him away to some womenfolk saying ‘he ain’t worth nutting’. Carver’s wife Susan nursed him through every conceivable childhood disease that his small frame seemed to be particularly prone to; but his traumatic experiences had brought on a severe stammer which she couldn’t cure. He was called Carver’s George; his true name (if such a concept had any meaning for a slave) is not known. When the war ended the slaves were freed, but George and Jim stayed with the Carvers. Jim was sturdy enough to become a shepherd and to do other farm chores; George was a weakling and helped around the house. His favourite recreation was to steal off to the woods and watch insects, study flowers, and become acquainted with nature. He had no schooling of any kind, but he learned to tend flowers and became an expert gardener. He was quite old when he saw his first picture, in a neighbour’s house; he went home enchanted, made some paint by squeezing out the dark juices of some berries, and started drawing on a rock. He kept on experimenting with drawings, using sharp stones to scratch lines on the smooth pieces of earth. He became known as the ‘plant doctor’ in the neighbourhood, although still only young, and helped everyone with their gardens.

At some distance from the farm there was a one-room cabin that was used as a school house during the week; it doubled as a church on Sundays. When George discovered its existence, he asked Moses Carver for permission to go there, but was told that no Negroes were allowed to go to that school. George overcame his shock at this news after a while; Susan Carver discovered an old spelling-book, and with her help he soon learned to read and write. Then he discovered that at Neosho, eight miles away, there was a school that would admit Negro children. Small, thin and still with his dreadful stammer, he set out for Neosho, determined to earn some money to support himself there. Just 14 years old, he made his home with a coloured midwife and washerwoman. ‘That boy told me he came to Neosho to find out what made hail and snow, and whether a person could change the colour of a flower by changing the seed. I told him he’d never find that out in Neosho. Maybe not even in Kansas City. But all the time I knew he’d find it out – somewhere.’ Thus Maria, the washerwoman; she also told him to call himself George Carver – he just couldn’t go on calling himself Carver’s George! By that name, he entered the tumbledown shack that was the Lincoln School for Coloured Children, with a young Negro teacher as its only staff member. The story of his fight for education against a hostile environment is too long to be told here; largely self-educated he finally obtained his Bachelor of Science degree at the age of 32, specialized in mycology (the study of fungus growths) became an authority in his subject, and finally accepted an invitation from Booker T. Washington, the foremost Negro leader of his day, to help him fund a Negro university. He accepted, and his heroic struggles to create an institute out of literally nothing are part of Negro history. 
He changed the agricultural and the eating habits of the South; he created single-handed a pattern of growing food, harvesting and cooking it which was to lift Negroes (and whites too!) out of the abject state of poverty and hunger to which they had been condemned by their own ignorance. And in addition to all his practical and teaching work, administration and speech-making, he had time to do creative and indeed fundamental research; he was one of the first scientists to work in the field of synthetics, and is credited with creating the science of chemurgy – ‘agricultural chemistry’. The American peanut industry is based on his work; today this is America’s sixth most important agricultural product, with many hundreds of by-products. He became more and more obsessed with the vision that out of agriculture and industrial waste useful material could be created, and this entirely original idea is widely believed to have been Carver’s most important contribution. The number of his discoveries and inventions is legion; in his field, he was as productive as Edison. He could have become a millionaire many times over but he never accepted money for his discoveries. Nor would he accept an increase in his salary, which remained at the 125 dollars a month (£100 per year) which Washington had originally offered him. (He once declined an offer by Edison to work with him at a minimum annual salary of 100000 dollars.) He finally died, over 80, in 1943. His death was mourned all over the United States. The New York Herald Tribune wrote: ‘Dr. Carver was, as everyone knows, a Negro. But he triumphed over every obstacle. Perhaps there is no one in this century whose example has done more to promote a better understanding between the races. Such greatness partakes of the eternal.’ He himself was never bitter, in spite of all the persecutions he and his fellow-Negroes had to endure. ‘No man can drag me down so low as to make me hate him.’ This was the epitaph on his grave.
He could have added fortune to fame, but caring for neither, he found happiness and honour in being helpful to the world.

On Simonton’s model of creativity:

Simonton’s own theory is interesting but lacks any conceivable psychological support. He calls his theory a ‘two-step’ model; he postulates that each individual creator begins with a certain ‘creative potential’ defined by the total number of contributions the creator would be capable of producing in an unrestricted life span. (Rather like a woman’s supply of ova!) There are presumably individual differences in this initial creative potential, which Simonton hardly mentions in the development of his theory. Now each creator is supposed to use up his supply of creative potential by transforming potential into actual contributions. (There is an obvious parallel here with potential energy in physics.) This translation of creative potential into actual creative products implies two steps. The first involves the conversion of creative potential into creative ideation; in the second step these ideas are worked into actual creative contributions in a form that can be appreciated publicly (elaboration). It is further assumed that the rate at which ideas are produced is proportional to the creative potential at a given time, and that the rate of elaboration is ‘proportional to the number of ideas in the works’ (Simonton, 1984b; p. 110). Simonton turns these ideas into a formula which generates a curve which gives a correlation between predicted and observed values in the upper 90s (Simonton, 1983b). The theory is inviting, but essentially untestable – how would we measure the ‘creative potential’ which presumably is entirely innate, and should exist from birth? How could we measure ideation, or elaboration, independently of external events? Of course the curve fits observations beautifully, but then all the constants are chosen to make sure of such a fit! Given the general shape of the curve (inverted U), many formulae could be produced to give such a fit.
Unless we are shown ways of independently measuring the variables involved, no proper test of any underlying psychological theory exists.
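To make the two-step idea concrete, here is a minimal sketch of how I read it: potential decays into ideas at a rate proportional to the remaining potential, and ideas are turned into finished products at a rate proportional to the stock of ideas. Solving those two linear rate equations gives a productivity curve of the form c(e^(-at) - e^(-bt)). The function names and the values of `m`, `a` and `b` below are my own illustrative assumptions, not Simonton’s fitted constants.

```python
import math

def productivity(t, m=100.0, a=0.04, b=0.05):
    """Finished contributions per year at career age t (in years).

    m: initial creative potential (total lifetime contributions), assumed
    a: ideation rate (potential -> ideas), assumed
    b: elaboration rate (ideas -> finished products), assumed
    """
    c = a * b * m / (b - a)
    return c * (math.exp(-a * t) - math.exp(-b * t))

def peak_age(a=0.04, b=0.05):
    """Career age at which the productivity curve reaches its maximum."""
    return math.log(b / a) / (b - a)

# Crude numerical check that lifetime output approaches the initial potential m.
lifetime_output = sum(productivity(step * 0.1) * 0.1 for step in range(5000))
```

With these made-up constants the curve rises, peaks a bit over two decades into the career, and then declines. That also illustrates Eysenck’s complaint: with three free constants, an inverted U is easy to fit, so a good fit by itself says little about the psychology.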

On geniuses’ misbehavior, featuring Newton:

Less often remarked, but possibly even more insidious, is the resistance by scientists to ‘scientific discovery’, as Barker (1961) has named this phenomenon. As he points out, in two systematic analyses of the social process of scientific discovery and invention, analyses which tried to be as inclusive of empirical facts and theoretical problems as possible, there was only one passing reference to such resistance in the one instance and none at all in the second (Gilfillan, 1935; Barker, 1952). This contrasts markedly with the attention paid to the resistance to scientific discovery on the part of economic, technological, religious ideological elements and groups outside science itself (Frank, 1957; Rossman, 1931; Shyrock, 1936; Stamp, 1937). This neglect is probably based on the erroneous notion embodied in the title of Oppenheimer’s (1955) book The Open Mind; we assume all too readily that objectivity is the characteristic of the scientist, and that he will impartially consider all the available facts and theories. Polanyi (1958, 1966) has emphasized the importance of the personality of the scientist, and no one familiar with the history of science can doubt that individual scientists are as emotional, jealous, quirky, self-centred, excitable, temperamental, ardent, enthusiastic, fervent, impassioned, zealous and hostile to competition as anyone else. The incredibly bellicose, malevolent and rancorous behaviour of scientists engaged in disputes about priority illustrates the truth of this statement. The treatment handed out to Halton Arp (1987), who dared to doubt the cosmological postulate about the meaning and interpretation of the red-shift is well worth pondering (Flanders, 1993). Objectivity flies out of the window when self-interest enters (Hagstrom, 1974).

The most famous example of a priority dispute is that between Newton and Leibnitz, concerning the invention of the calculus (Manuel, 1968). The two protagonists did not engage in the debate personally, but used proxies, hangers-on who would use their vituperative talents to the utmost in the service of their masters. Newton in particular abused his powers as President of the Royal Society in a completely unethical manner. He nominated his friends and supporters to a theoretically neutral commission of the Royal Society to consider the dispute; he wrote the report himself, carefully keeping his own name out of it, and he personally persecuted Leibnitz beyond the grave, insisting that he had plagiarized Newton’s discovery – which clearly was untrue, as posterity has found. Neither scientist emerges with any credit from the Machiavellian controversy, marred by constant untruths, innuendos of a personal nature, insults, and outrageous abuse which completely obscured the facts of the case. Newton behaved similarly towards Robert Hooke, Locke, Flamsteed and many others; as Manuel (1968) says, ‘Newton was aware of the mighty anger that smouldered within him all his life, eternally seeking objects. … many were the times when (his censor) was overwhelmed and the rage could not be contained’ (p. 343). ‘Even if allowances are made for the general truculence of scientists and learned men, he remains one of the most ferocious practitioners of the art of scientific controversy. Genteel concepts of fair play are conspicuously absent, and he never gave any quarter’ (p. 345). So much for scientific objectivity!

More Newton!:

Once a theory has been widely accepted, it is difficult to displace, even though the evidence against it may be overwhelming. Kuhn (1957) points out that even after the publication of De Revolutionibus most astronomers retained their belief in the central position of the earth; even Brahe (Thoren, 1990) whose observations were accurate enough to enable Kepler (Caspar, 1959) to determine that the Mars orbit around the sun was elliptical, not circular, could not bring himself to accept the heliocentric view. Thomas Young proposed a wave theory of light on the basis of good experimental evidence, but because of the prestige of Newton, who of course favoured a corpuscular view, no-one accepted Young’s theory (Gillespie, 1960). Indeed, Young was afraid to publish the theory under his own name, in case his medical practice might suffer from his opposition to the god-like Newton! Similarly, William Harvey’s theory of the circulation of the blood was poorly received, in spite of his prestigious position as the King’s physician, and harmed his career (Keele, 1965). Pasteur too was hounded because his discovery of the biological character of the fermentation process was found unacceptable. Liebig and many others defended the chemical theory of these processes long after the evidence in favour of Pasteur was conclusive (Dubos, 1950). Equally his micro-organism theory of disease caused endless strife and criticism. Lister’s theory of antisepsis (Fisher, 1977) was also long argued over, and considered absurd; so were the contributions of Koch (Brock, 1988) and Ehrlich (Marquardt, 1949). Priestley (Gibbs, 1965) retained his views of phlogiston as the active principle in burning, and together with many others opposed the modern theories of Lavoisier, with considerable violence.
Alexander Maconochie’s very successful elaboration and application of what would now be called ‘Skinnerian principle’ to the reclamation of convicted criminals in Australia, led to his dismissal (Barry, 1958).

But today is different! Or maybe not:

The story is characteristic in many ways, but it would be quite wrong to imagine that this is the sort of thing that happened in ancient, far-off days, and that nowadays scientists behave in a different manner. Nothing has changed, and I have elsewhere described the fates of modern Lochinvars who fought against orthodoxy and were made to suffer mercilessly (Eysenck, 1990a). The battle against orthodoxy is endless, and there is no chivalry; if power corrupts (as it surely does!), the absolute power of the orthodoxy in science corrupts absolutely (well, almost!). It is odd that books on genius seldom if ever mention this terrible battle that originality so often has when confronting orthodoxy. This fact certainly accounts for some of the personality traits so often found in genius, or even the unusually creative non-genius. The mute, inglorious Milton is a contradiction in terms, an oxymoron; your typical genius is a fighter, and the term ‘genius’ is by definition accorded the creative spirit who ultimately (often long after his death) wins through. An unrecognized genius is meaningless; success socially defined is a necessary ingredient. Recognition may of course be long delayed; the contribution of Green (Connell, 1993) is a good example.

On fraud in science, after discussing Newton’s fudging of data, and summarizing Kepler’s:

It is certainly startling to find an absence of essential computational details because ‘taediosum esset’ to give them. But worse is to follow. Donahue makes it clear that Kepler presented theoretical deduction as computations based upon observation. He appears to have argued that induction does not suffice to generate true theories, and to have substituted for actual observations figures deduced from the theory. This is historically interesting in throwing much light on the origins of scientific theories, but is certainly not a procedure recommended to experimental psychologists by their teachers!

Many people have difficulties in understanding how a scientist can fraudulently ‘fudge’ his data in this fashion. The line of descent seems fairly clear. Scientists have extremely high motivation to succeed in discovering the truth; their finest and most original discoveries are rejected by the vulgar mediocrities filling the ranks of orthodoxy. They are convinced that they have found the right answer; Newton believed it had been vouchsafed him by God, who explicitly wanted him to preach the gospel of divine truth. The figures don’t quite fit, so why not fudge them a little bit to confound the infidels and unbelievers? Usually the genius is right, of course, and we may in retrospect excuse his childish games, but clearly this cannot be regarded as a licence for non-geniuses to foist their absurd beliefs on us. Freud is a good example of someone who improved his clinical findings with little regard for facts (Eysenck, 1990b), as many historians have demonstrated. Quod licet Jovi non licet bovi – what is permitted to Jupiter is not allowed the cow!

One further point. Scientists, as we shall see, tend to be introverted, and introverts show a particular pattern of level of aspiration (Eysenck, 1947) – it tends to be high and rigid. That means a strong reluctance to give up, to relinquish a theory, to acknowledge defeat. That, of course, is precisely the pattern shown by so many geniuses, fighting against external foes and internal problems. If they are right, they are persistent; if wrong, obstinate. As usual the final result sanctifies the whole operation (fudging included); it is the winners who write the history books!

The historical examples would seem to establish the importance of motivational and volitional factors, leading to persistence in opposition against a hostile world, and sometimes even to fraud when all else fails. Those whom the establishment refuses to recognize appropriately fight back as best they can; they should not be judged entirely by the standards of the mediocre!

This example and the notes about double standards for genius are all the more interesting in light of the recent problems with Eysenck’s own studies, published with yet another maverick!

And now, to sunspots and genius:

Ertel used recorded sun-spot activity going back to 1600 or so, and before that by analysis of the radiocarbon isotope C14, whose production, as recorded in trees, gives an accurate picture of sun-spot activity. Plotted in relation to sun-spot activity were historical events, either wars, revolutions, etc. or specific achievements in painting, drama, poetry, science and philosophy. Note that Ertel’s investigations resemble a ‘double blind’ paradigm, in that the people who determined the solar cycle, and those who judged the merits of the artists and scientists in question, were ignorant of the purpose to which Ertel would put their data, and did not know anything about the theories involved. Hence the procedure is completely objective, and owes nothing to Ertel’s views, which in any case were highly critical of Chizhevsky’s ideas at the beginning.

The irregularities of the solar cycle present difficulties to the investigator, but also, as we shall see, great advantages. One way around this problem was suggested by Ertel; it consists of looking at each cycle separately; maximum solar activity (sol. max.) is denoted 0, and the years preceding or succeeding 0 are marked -1, -2, -3 etc., or +1, +2, +3 etc. Fig. 4.2 shows the occurrence of 17 conflicts between socialist states from 1945 to 1982, taken from a table published by the historian Bebeler, i.e. chosen in ignorance of the theory. In the figure the solid circles denote the actual distribution of events, the empty circles the expected distribution on a chance basis. Agreement with theory is obvious, 13 of the 17 events occurring between -1 and +1 solar maximum (Ertel, 1992a,b).

Actually Ertel bases himself on a much broader historical perspective, having amassed 1756 revolutionary events from all around the world, collected from 22 historical compendia covering the times from 1700 to the present. There appears good evidence in favour of Chizhevsky’s original hypothesis. However, in this book we are more concerned with Ertel’s extension to cultural events, i.e. the view that art and science prosper most when solar activity is at a minimum. Following his procedure regarding revolutionary events, Ertel built up a data bank concerned with scientific discoveries. Fig. 4.3 shows the outcome; the solid lines show the relation between four scientific disciplines and solar activity, while the black dots represent the means of the four scientific disciplines. However, as Ertel argues, the solar cycle may be shorter or longer than 11 years, and this possibility can be corrected by suitable statistical manipulation; the results of such manipulation, which essentially records strength of solar activity regardless of total duration of cycle, are shown on the right. It will be clear that with or without correction for duration of the solar cycle, there is a very marked U-shaped correlation with this activity, with an average minimum of scientific productivity at points -1, 0 and +1, as demanded by the theory.

Intriguing! Someone must have tested this stuff since. It should be easy to curate a large dataset from Murray’s Human Accomplishment or Wikipedia-based datasets, and see if it holds up. More generally, it is somewhat in line with quantitative historical takes by cliodynamics people.
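For anyone who wants to try, the core of Ertel’s procedure as described above is simple: recode each event year as its signed distance from the nearest solar maximum, then count events per offset. Below is a rough sketch. The years in `toy_max_years` and `toy_events` are made up purely to show the mechanics; they are not real solar or historical data. The binomial tail is my own back-of-the-envelope check, assuming an 11-year cycle in which 3 of 11 years fall within ±1 of the maximum and events are otherwise spread uniformly, which is not necessarily the chance baseline Ertel used.

```python
import math
from collections import Counter

def offsets_from_nearest_max(event_years, solar_max_years):
    """Signed distance in years from each event to the nearest solar maximum."""
    offsets = []
    for year in event_years:
        nearest = min(solar_max_years, key=lambda m: abs(year - m))
        offsets.append(year - nearest)
    return offsets

def share_within(offsets, window=1):
    """Fraction of events falling within +/- window years of a maximum."""
    return sum(abs(o) <= window for o in offsets) / len(offsets)

def binomial_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# Hypothetical toy data, NOT real solar maxima or historical events.
toy_max_years = [1900, 1911, 1922]
toy_events = [1899, 1900, 1901, 1905, 1911, 1912, 1916, 1921]

offsets = offsets_from_nearest_max(toy_events, toy_max_years)
histogram = Counter(offsets)   # events per offset, the raw material for a plot like Fig. 4.2
share = share_within(offsets)  # 0.75 for the toy data

# The quoted result: 13 of 17 events within +/-1 year of solar maximum.
# Chance rate of 3/11 is an assumption (see above).
p_value = binomial_tail(17, 13, 3 / 11)
```

Under that crude baseline, 13 of 17 is far beyond chance, so the interesting question is whether the event list and the baseline survive scrutiny with a bigger, preregistered dataset.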

Intuition vs. thinking, system 1 vs. 2, and many other names:

It was of course Jung (1926) who made intuition one of the four functions of his typology (in addition to thinking, feeling, and sensation). This directed attention from the process of intuition to intuition as a personality variable – we can talk about the intuitive type as opposed to the thinking type, the irrational as opposed to the rational. (Feeling, too, is rational, while sensation is irrational, i.e. does not involve a judgment.) Beyond this, Jung drifts off into the clouds peopled with archetypes and constituted of the ‘collective unconscious’, intuitions of which are held to be far more important than intuitions of the personal unconscious. Jung’s theory is strictly untestable, but has been quite important historically in drawing attention to the ‘intuitive person’, or intuition as a personality trait.

Jung, like most philosophers, writers and psychologists, uses the contrast between ‘intuition’ and ‘logic’ as an absolute, a dichotomy of either – or. Yet when we consider the definitions and uses of the terms, we find that we are properly dealing with a continuum, with ‘intuition’ and ‘logic’ at opposite extremes, rather like the illustration in Fig. 5.2. In any problem-solving some varying degree of intuition is involved, and that may be large or small in amount. Similarly, as even Jung recognized, people are more or less intuitive; personalities are ranged along a continuum. It is often easier to talk as if we were dealing with dichotomies (tall vs. short; bright vs. dumb; intuitive vs. logical), but it is important to remember that this is not strictly correct; we always deal with continua.

The main problem with the usual treatment of ‘intuition’ is the impossibility of proof; whatever is said or postulated is itself merely intuitive, and hence in need of translation into testable hypotheses. Philosophical or even common-sense notions of intuition, sometimes based on experience as in the case of Poincaré, may seem acceptable, but they suffer the fate of all introspection – they may present us with a problem, but do not offer a solution.

The intuitive genius of Ramanujan:

For Hardy, as Kanigel says, Ramanujan’s pages of theorems were like an alien forest whose trees were familiar enough to call trees, yet so strange they seemed to come from another planet. Indeed, it was the strangeness of Ramanujan’s theorems, not their brilliance, that struck Hardy first. Surely this was yet another crank, he thought, and put the letter aside. However, what he had read gnawed at his imagination all day, and finally he decided to take the letter to Littlewood, a mathematical prodigy and friend of his. The whole story is brilliantly (and touchingly) told by Kanigel; fraud or genius, they asked themselves, and decided that genius was the only possible answer. All honour to Hardy and Littlewood for recognizing genius, even under the colourful disguise of this exotic Indian plant; other Cambridge mathematicians, like Baker and Hobson, had failed to respond to similar letters. Indeed, as Kanigel says, ‘it is not just that he discerned genius in Ramanujan that stands to his credit today; it is that he battered down his own wall of skepticism to do so’ (p. 171).

The rest of his short life (he died at 33) Ramanujan was to spend in Cambridge, working together with Hardy who tried to educate him in more rigorous ways and spent much time in attempting to prove (or disprove!) his theorems, and generally see to it that his genius was tethered to the advancement of modern mathematics. Ramanujan’s tragic early death left a truly enormous amount of mathematical knowledge in the form of unproven theorems of the highest value, which were to provide many outstanding mathematicians with enough material for a life’s work to prove, integrate with what was already known, and generally give it form and shape acceptable to orthodoxy. Ramanujan’s standing may be illustrated by an informal scale of natural mathematical ability constructed by Hardy, on which he gave himself a 25 and Littlewood a 30. To David Hilbert, the most eminent mathematician of his day, he gave an 80. To Ramanujan he gave 100! Yet, as Hardy said:

the limitations of his knowledge were as startling as its profundity. Here was a man who could work out modular equations and theorems of complex multiplication, to orders unheard of, whose mastery of continued fractions was, on the formal side at any rate, beyond that of any mathematician in the world, who had found for himself the functional equation of the Zeta-function, and the dominant terms of many of the most famous problems in the analytical theory of numbers; and he had never heard of a doubly periodic function or of Cauchy’s theorem, and had indeed but the vaguest idea of what a function of a complex variable was. His ideas as to what constituted a mathematical proof were of the most shadowy description. All his results, new or old, right or wrong, had been arrived at by a process of mingled arguments, intuition, and induction, of which he was entirely unable to give any coherent account (p. 714).

Ramanujan’s life throws some light on the old question of the ‘village Hampden’ and ‘mute inglorious Milton’; does genius always win through, or may the potential genius languish unrecognized and undiscovered? In one sense the argument entails a tautology: if genius is defined in terms of social recognition, an unrecognized genius is of course a contradictio in adjecto. But if we mean, can a man who is a potential genius be prevented from demonstrating his abilities?, then the answer must surely be in the affirmative. Ramanujan was saved from such a fate by a million-to-one accident. All his endeavours to have his genius recognized in India had come to nothing; his attempts to interest Baker and Hobson in Cambridge came to nothing; his efforts to appeal to Hardy almost came to nothing. He was saved by a most unlikely accident. Had Hardy not reconsidered his first decision, and consulted Littlewood, it is unlikely that we would ever have heard of Ramanujan! How many mute inglorious Miltons (and Newtons, Einsteins and Mendels) there may be we can never know, but we may perhaps try and arrange things in such a way that their recognition is less likely to be obstructed by bureaucracy, academic bumbledom and professional envy. In my experience, the most creative of my students and colleagues have had the most difficulty in finding recognition, acceptance, and research opportunities; they do not fit in, their very desire to devote their lives to research is regarded with suspicion, and their achievements inspire envy and hatred.

Eysenck talks about his psychoticism construct, which is almost the same as the modern general psychopathology factor, both abbreviated to P:

The study was designed to test Kretschmer’s (1946, 1948) theory of a schizothymia-cyclothymia continuum, as well as my own theory of a normality-psychosis continuum. Kretschmer was one of the earliest proponents of a continuum theory linking psychotic and normal behaviour. There is, he argued, a continuum from schizophrenia through schizoid behaviour to normal dystonic (introverted) behaviour; on the other side of the continuum we have syntonic (extraverted) behaviour, cycloid and finally manic-depressive disorder. He is eloquent in discussing how psychotic abnormality shades over into odd and eccentric behaviour and finally into quite normal typology. Yet, as I have pointed out (Eysenck, 1970a,b), the scheme is clearly incomplete. We cannot have a single dimension with ‘psychosis’ at both ends; we require at least a two dimensional scheme, with psychosis-normal as one axis, and schizophrenia-affective disorder as the other.

In order to test this hypothesis, I designed a method of ‘criterion analysis’ (Eysenck, 1950, 1952a,b), which explicitly tests the validity of continuum vs. categorical theories. Put briefly, we take two groups (e.g. normal vs. psychotic), and apply to both objective tests which significantly discriminate between the groups. We then intercorrelate the tests within each group, and factor analyse the resulting matrices. If and only if the continuum hypothesis is correct will it be found that the factor loadings in both matrices will be similar or identical, and that these loadings will be proportional to the degree to which the various tests discriminate between the two criterion groups.

An experiment has been reported, using this method. Using 100 normal controls, 50 schizophrenics and 50 manic-depressives, 20 objective tests which had been found previously to correlate with psychosis were applied to all the subjects (Eysenck, 1952b). The results clearly bore out the continuum hypothesis. The two sets of factor loadings correlated .87, and both were proportional to the differentiating power of the tests (r = .90 and .95, respectively). These figures would seem to establish the continuum hypotheses quite firmly; the results of the experiment are not compatible with a categorical type of theory.
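The logic of criterion analysis is easier to see in a toy simulation. The sketch below generates two groups whose test scores are driven by the same latent dimension, with the second group simply shifted upward on it, i.e. the continuum hypothesis is true by construction. It then extracts first-factor loadings within each group and compares them with each test’s correlation with group membership (the "criterion column"). Two caveats: I use the first principal component of the correlation matrix as a stand-in for whatever factor method Eysenck used, and the loadings, group sizes and group shift are all invented for illustration.

```python
import math
import random

def pearson(x, y):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def first_factor_loadings(rows, iterations=200):
    """First principal-component loadings of the tests' correlation matrix (power iteration)."""
    cols = list(zip(*rows))
    k = len(cols)
    corr = [[pearson(cols[i], cols[j]) for j in range(k)] for i in range(k)]
    v = [1.0] * k
    for _ in range(iterations):
        w = [sum(corr[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(a * a for a in w))
        v = [a / norm for a in w]
    eigenvalue = sum(v[i] * sum(corr[i][j] * v[j] for j in range(k)) for i in range(k))
    return [a * math.sqrt(eigenvalue) for a in v]

def simulate_group(n, latent_mean, true_loadings, rng):
    """Each test score = loading * latent trait + unique noise (unit variance within group)."""
    rows = []
    for _ in range(n):
        latent = rng.gauss(latent_mean, 1.0)
        rows.append([l * latent + math.sqrt(1 - l * l) * rng.gauss(0, 1)
                     for l in true_loadings])
    return rows

def criterion_column(group_a, group_b):
    """Correlation of each test with group membership (0 = group_a, 1 = group_b)."""
    labels = [0] * len(group_a) + [1] * len(group_b)
    cols = list(zip(*(group_a + group_b)))
    return [pearson(col, labels) for col in cols]

rng = random.Random(0)
true_loadings = [0.9, 0.8, 0.6, 0.5, 0.3, 0.2]  # hypothetical values
normals = simulate_group(500, 0.0, true_loadings, rng)
patients = simulate_group(500, 2.0, true_loadings, rng)  # same dimension, shifted mean

load_n = first_factor_loadings(normals)
load_p = first_factor_loadings(patients)
criterion = criterion_column(normals, patients)

r_between = pearson(load_n, load_p)    # similarity of the two loading patterns
r_crit_n = pearson(load_n, criterion)  # loadings vs. discriminating power, normals
r_crit_p = pearson(load_p, criterion)  # loadings vs. discriminating power, patients
```

When the continuum holds, all three correlations come out high, which is the pattern Eysenck reports (.87, .90, .95). If group membership were instead a separate category unrelated to the within-group dimension, the tests that discriminate best between groups would have no reason to be the ones that load highest within groups.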

Eysenck summarizes his model:

Possessing this trait, however, does not guarantee creative achievement. Trait creativity may be a necessary component of such achievement, but many other conditions must be fulfilled, many other traits added (e.g. ego-strength), many abilities and behaviours added (e.g. IQ, persistence), and many socio-cultural variables present, before high creative achievement becomes probable. Genius is characterized by a very rare combination of gifts, and these gifts function synergistically, i.e. they multiply rather than add their effects. Hence the mostly normally distributed conditions for supreme achievement interact in such a manner as to produce a J-shaped distribution, with huge numbers of non- or poor achievers, a small number of high achievers, and the isolated genius at the top.
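The multiply-versus-add point can be checked with a quick simulation. The sketch below draws a handful of independent, normally distributed traits per person and compares the distribution of their sum with the distribution of their product. The number of traits, their mean and spread, and the floor at zero are arbitrary choices of mine for illustration, not values from the book.

```python
import math
import random
import statistics

def skewness(values):
    """Population skewness: about 0 for a symmetric distribution, positive for a long right tail."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)

def simulate(n_people=20000, n_traits=5, seed=1):
    """Return (additive, multiplicative) achievement scores for simulated people."""
    rng = random.Random(seed)
    additive, multiplicative = [], []
    for _ in range(n_people):
        # Independent normal traits, mean 1.0 and sd 0.3 (assumed), floored at zero.
        traits = [max(rng.gauss(1.0, 0.3), 0.0) for _ in range(n_traits)]
        additive.append(sum(traits))
        multiplicative.append(math.prod(traits))
    return additive, multiplicative

additive, multiplicative = simulate()
skew_add = skewness(additive)
skew_mult = skewness(multiplicative)
```

The additive score stays roughly bell-shaped, while the multiplicative score gets a long right tail: most people bunch up at the low end and a few end up far above everyone else, which is the qualitative shape Eysenck describes.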

This, in very rough outline, is the theory here put forward. As discussed, there is some evidence in favour of the theory, and very little against it. Can we safely say that the theory possesses some scientific credentials, and may be said to be to some extent a valid account of reality? There are obvious weaknesses. Genius is extremely rare, and no genius has so far been directly studied with such a theory in mind. My own tests have been done to study deductions from the theory, and these have usually been confirmatory. Is that enough, and how far does it get us?

Terms like ‘theory’, of course, are often abused. Thus Koestler (1964) attempts to explain creativity in terms of his theory of ‘bisociation’ according to which the creative act ‘always operates on more than one plane’ (p. 36). This is not a theory, but a description; it cannot be tested, but acts as a definition. Within those limits, it is acceptable as a non-contingent proposition (Smetslund, 1984), i.e. necessarily true and not subject to empirical proof. A creative idea must, by definition, bring together two or more previously unrelated concepts. As an example, consider US Patent 5,163,447, the ‘force-sensitive, sound-playing condom’, i.e. an assembly of a piezo-electric sound transducer, microchip, power supply and miniature circuitry in the rim of a condom, so that when pressure is applied, it emits ‘a predetermined melody or a voice message’. Here is bisociation in its purest form, bringing together mankind’s two most pressing needs, safe sex and eternal entertainment. But there is no proper theory here; nothing is said that could be disproved by experiment. Theory implies a lot more than simple description.
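Eysenck's claim that normally distributed ingredients which multiply rather than add will produce a strongly skewed, J-shaped distribution can be checked with a quick simulation. The following is only a minimal sketch and nothing from the book: the number of traits (5), their distribution (normal with mean 1 and SD 0.3, floored at zero) and the sample size are arbitrary assumptions chosen for illustration. It compares the skewness of an additive composite with that of a multiplicative one.

```python
import random
import statistics


def skewness(xs):
    """Population skewness (third standardized moment)."""
    m = statistics.fmean(xs)
    s = statistics.pstdev(xs)
    return sum(((x - m) / s) ** 3 for x in xs) / len(xs)


def simulate(n_people=20000, n_traits=5, seed=1):
    """Return (skew of additive composite, skew of multiplicative composite).

    All parameters are illustrative assumptions, not values from Eysenck.
    """
    rng = random.Random(seed)
    additive, multiplicative = [], []
    for _ in range(n_people):
        # Each trait is roughly normal around 1; floor at 0 so products stay non-negative.
        traits = [max(rng.gauss(1.0, 0.3), 0.0) for _ in range(n_traits)]
        additive.append(sum(traits))
        product = 1.0
        for t in traits:
            product *= t
        multiplicative.append(product)
    return skewness(additive), skewness(multiplicative)


add_skew, mult_skew = simulate()
print(f"additive skew: {add_skew:.2f}, multiplicative skew: {mult_skew:.2f}")
```

Under these assumptions the additive composite stays close to symmetric, while the multiplicative one has a long right tail: many low scorers and a few extreme ones, which is the shape Eysenck describes.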

And he continues outlining his own theory of scientific progress:

The philosophy of science has thrown up several criteria for judging the success of a theory in science. All are agreed that it must be testable, but there are two alternative ways of judging the outcome of such tests. Tradition (including the Vienna school) insists on the importance of ‘confirmation’; the theory is in good shape as long as results of testing deductions are positive (Suppe, 1974). Popper (1959, 1979), on the other hand, uses falsification as his criterion, pointing out that theories can never be proved to be correct, because we cannot ever test all the deductions that can possibly be made. More recent writers like Lakatos (1970, 1978; Lakatos and Musgrave, 1970) have directed their attention rather at a whole research programme, which can be either advancing or degenerating. An advancing research programme records a number of successful predictions which suggest further theoretical advances; a degenerating research programme seeks to excuse its failures by appealing to previously unconsidered boundary conditions. On those terms we are surely dealing with an advancing programme shift; building on research already done, many new avenues are opening up for supporting or disproving the theories making up our model.

It has always seemed to me that the Viennese School, and Popper, too, were wrong in disregarding the evolutionary aspect of scientific theories. Methods appropriate for dealing with theories having a long history of development might not be optimal in dealing with theories in newly developing fields, lacking the firm sub-structure of the older kind. Newton, as already mentioned, succeeded in physics, where much sound knowledge existed in the background, as well as good theories; he failed in chemistry/alchemy where they did not. Perhaps it may be useful to put forward my faltering steps in this very complex area situated between science and philosophy (Eysenck, 1960, 1985b).

It is agreed that theories can never be proved right, and equally that they are dependent on a variety of facts, hunches and assumptions outside the theory itself; these are essential for making the theory testable. Cohen and Nagel (1936) put the matter very clearly, and take as their example Foucault’s famous experiment in which he showed that light travels faster in air than in water. This was considered a crucial experiment to decide between two hypotheses: H1, the hypothesis that light consists of very small particles travelling with enormous speeds, and H2, the hypothesis that light is a form of wave motion. H1 implies the proposition P1 that the velocity of light in water is greater than in air, while H2 implies the proposition P2 that the velocity of light in water is less than in air. According to the doctrine of crucial experiments, the corpuscular hypothesis of light should have been banished to limbo once and for all. However, as is well known, contemporary physics has revived the corpuscular theory in order to explain certain optical effects which cannot be explained by the wave theory. What went wrong?

As Cohen and Nagel point out, in order to deduce the proposition P1 from H1 and in order that we may be able to perform the experiment of Foucault, many other assumptions, K, must be made about the nature of light and the instruments we employ in measuring its velocity. Consequently, it is not the hypothesis H1 alone which is being put to the test by the experiment – it is H1 and K. The logic of the crucial experiment may therefore be put in this fashion. If H1 and K, then P1; if now experiment shows P1 to be false, then either H1 is false or K (in part or complete) is false (or of course both may be false!). If we have good grounds for believing that K is not false, H1 is refuted by the experiment. Nevertheless the experiment really tests both H1 and K. If in the interest of the coherence of our knowledge it is found necessary to revise the assumptions contained in K, the crucial experiment must be reinterpreted, and it need not then decide against H1.

What I am suggesting is that when we are using H + K to deduce P, the ratio of H to K will vary according to the state of development of a given science. At an early stage, K will be relatively little known, and negative outcomes of testing H + K will quite possibly be due to faulty assumptions concerning K. Such theories I have called ‘weak’, as opposed to ‘strong’ theories where much is known about K, so that negative outcomes of testing H + K are much more likely to be due to errors in H (Eysenck, 1960, 1985b).

We may now indicate the relevance of this discussion to our distinction between weak and strong theories. Strong theories are elaborated on the basis of a large, well founded and experimentally based set of assumptions, K, so that the results of new experiments are interpreted almost exclusively in terms of the light they throw on H1, H2, …, Hn. Weak theories lack such a basis, and negative results of new experiments may be interpreted with almost equal ease as disproving H or disproving K. The relative importance of K can of course vary continuously, giving rise to a continuum; the use of the terms ‘strong’ and ‘weak’ is merely intended to refer to the extremes of this continuum, not to suggest the existence of two quite separate types of theories. In psychology, K is infinitely less strong than it is in physics, and consequently theories in psychology inevitably lie towards the weaker pole.

Weak theories in science, then, generate research the main function of which is to investigate certain problems which, but for the theory in question, would not have arisen in that particular form; their main purpose is not to generate predictions the chief use of which is the direct verification or confirmation of the theory. This is not to say that such theories are not weakened if the majority of predictions made are infirmed; obviously there comes a point when investigators turn to more promising theories after consistent failure with a given hypothesis, however interesting it may be. My intention is merely to draw attention to the fact – which will surely be obvious to most scientifically trained people – that both proof and failure of deductions from a scientific hypothesis are more complex than may appear at first sight, and that the simple-minded application of precepts derived from strong theories to a field like psychology may be extremely misleading. Ultimately, as Conant has emphasized, scientific theories of any kind are not discarded because of failures of predictions, but only because a better theory has been advanced.

The reader with a philosophy background will naturally think of this in terms of The Web of Belief, which takes us back to my earlier days of blogging philosophy!
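One way to make the weak/strong distinction concrete is a toy Bayesian reading of the "H and K" logic. This is my own formalization and not something Eysenck or Cohen and Nagel give. It assumes that H and K are independent a priori, that H and K together entail the prediction, and that a failed prediction therefore simply rules out the case where both are true (all remaining cases are treated as equally compatible with the failure). The priors used below are made-up numbers.

```python
def posterior_h_after_failure(prior_h, prior_k):
    """P(H | prediction failed) under the toy assumptions stated above.

    A failed prediction eliminates the 'H and K both true' case; the
    remaining probability mass is renormalized.
    """
    both_true = prior_h * prior_k
    return (prior_h - both_true) / (1.0 - both_true)


# Hypothetical priors: the same 50/50 hypothesis tested against
# well-established background assumptions (strong theory) versus
# shaky background assumptions (weak theory).
strong = posterior_h_after_failure(prior_h=0.5, prior_k=0.99)
weak = posterior_h_after_failure(prior_h=0.5, prior_k=0.50)
print(f"strong theory: P(H) falls to {strong:.3f}")
print(f"weak theory:   P(H) falls to {weak:.3f}")
```

With well-supported auxiliary assumptions, a single failure nearly eliminates H; with poorly supported ones, the same failure leaves H quite alive, because the blame can plausibly fall on K instead.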

intelligence / IQ / cognitive ability Psychiatry

Pedophilia and related traits

The question of whether homosexuality is related to various bad things is independent of one’s stance towards homosexuals as citizens, their right to marry etc., all of which is fine by me.

Given some recent tweets about the penile response of a large, clinical sample of men to various stimuli, there was some Twitter speculation that homosexuality and pedophilia were linked. Tara McCarthy asked me to look into the matter, so I took a quick look. This is not meant to be exhaustive.


I have previously shown (never formally published, not even a preprint, but there’s an R notebook) that there is a general sexual kink or deviance factor, meaning that interest in any kink/paraphilia increases the likelihood that someone is interested in any other kink, no matter which pair is chosen. I was unfortunately unable to confirm this finding in larger samples (requests for data had no reply), but the factor loadings are shown below. I analyzed the data by sex because I had the correlation matrix for both sexes, and because it was plausible that the relationships between the interests may differ by sex (i.e. interaction). However, the pattern was roughly the same:

Pedophilia is included but with a relatively low loading: about .30 for women and .45 for men. These differences may reflect chance. The sample size was large-ish at n=1226, about 2/3 female. If one assumes independent correlations, the p value for the test of different correlations is .03 (using psych::paired.r) which is not overwhelming. The low loading of some items is not due to variance bias effects as all measures were based on a composite score of 32 of 7-point items.
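For readers who want to see what such a test does, here is a minimal Python sketch of the standard Fisher z-test for two independent correlations, which is, as far as I understand it, the computation psych::paired.r performs when the two correlations come from separate samples. The function name is hypothetical. The example inputs are rough values read off the text (loadings of about .30 and .45, and n=1226 split roughly 2/3 female, so about 817 and 409); the exact loadings and group sizes are not given, and the p value is sensitive to them, so the output should not be expected to match the reported .03.

```python
import math
from statistics import NormalDist


def independent_r_test(r1, n1, r2, n2):
    """Two-tailed Fisher z-test for the difference between two
    correlations from independent samples. Returns (z, p)."""
    z1, z2 = math.atanh(r1), math.atanh(r2)  # Fisher r-to-z transform
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / se
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p


# Approximate inputs only (assumptions, see the paragraph above).
z, p = independent_r_test(r1=0.30, n1=817, r2=0.45, n2=409)
print(f"z = {z:.2f}, p = {p:.4f}")
```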

Another way to think about it is to think of interests in these kinks as deviance from normal/natural sexual preferences (heterosexual vaginal). In this fashion, both pedophilia and homosexuality are deviances and would thus per the general factor interpretation be expected to be correlated. However, we don’t need to trust theory here because there’s a bunch of studies on the topic:

Since gays only constitute some 3%ish of men, but constitute 27% of this offender sample, there does seem to be a rather strong link (RR = 900%). Bisexuals are also very over-represented. The sample is quite large at n=991 and is based on archival data for convicted offenders.
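The arithmetic behind the 900% figure is simple enough to spell out. In the sketch below, the first function gives the over-representation ratio (share in the sample divided by share in the population), which is where 27% / 3% = 9 comes from. The second gives the relative risk in the stricter sense of comparing the group against everyone else. Both take the rough 3% base rate at face value, treat the offender sample as representative of offenders, and lump all other men (including bisexuals) into one comparison group, so these are illustrative assumptions only.

```python
def over_representation(share_in_sample, share_in_population):
    """How many times more common the group is in the sample than in the population."""
    return share_in_sample / share_in_population


def relative_risk(share_in_sample, share_in_population):
    """Risk in the group divided by risk in everyone else (via Bayes' rule)."""
    in_group = share_in_sample / share_in_population
    everyone_else = (1.0 - share_in_sample) / (1.0 - share_in_population)
    return in_group / everyone_else


ratio = over_representation(0.27, 0.03)
rr = relative_risk(0.27, 0.03)
print(f"over-representation: {ratio:.1f}x, relative risk vs. all other men: {rr:.1f}x")
```

The second number is somewhat larger than the first because the comparison group is itself under-represented in the sample.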

Objective. To determine if recognizably homosexual adults are frequently accused of the sexual molestation of children.

Design. Chart review of medical records of children evaluated for sexual abuse.

Setting. Child sexual abuse clinic at a regional children’s hospital.

Patients. Patients were 352 children (276 girls and 76 boys) referred to a subspecialty clinic for the evaluation of suspected child sexual abuse. Mean age was 6.1 years (range, 7 months to 17 years).

Data collected. Charts were reviewed to determine the relationships of the children to the alleged offender, the sex of the offender, and whether or not the alleged offender was reported to be gay, lesbian, or bisexual.

Results. Abuse was ruled out in 35 cases. Seventy-four children were allegedly abused by other children and teenagers less than 18 years old. In 9 cases, an offender could not be identified. In the remaining 269 cases, two offenders were identified as being gay or lesbian. In 82% of cases (222/269), the alleged offender was a heterosexual partner of a close relative of the child. Using the data from our study, the 95% confidence limits, of the risk children would identify recognizably homosexual adults as the potential abuser, are from 0% to 3.1%. These limits are within current estimates of the prevalence of homosexuality in the general community.

Conclusions. The children in the group studied were unlikely to have been molested by identifiably gay or lesbian people.

Thus, it found approximately nothing using a different approach.

A random sample of 175 males convicted of sexual assault against children was screened with reference to their adult sexual orientation and the sex of their victims. The sample divided fairly evenly into two groups based on whether they were sexually fixated exclusively on children or had regressed from peer relationships. Female children were victimized nearly twice as often as male children. All regressed offenders, whether their victims were male or female children, were heterosexual in their adult orientation. There were no examples of regression to child victims among peer-oriented, homosexual males. The possibility emerges that homosexuality and homosexual pedophilia may be mutually exclusive and that the adult heterosexual male constitutes a greater risk to the underage child than does the adult homosexual male.

Also about null, at least in the authors’ interpretation. Though I note that it is a bit weird that they find them all to be heterosexual in adult orientation, while male children were about 33% of the targets. Note that the first study above used the sex of the victims to define the sexual orientation, while this one did not.

Described as:

“This dataset included measures taken with the same laboratory method used by Freund et al., namely, phallometric testing. Phallometric testing (sometimes called penile plethysmography) is an objective technique for assessing erotic interests in men. In phallometric tests for gender and age orientation, the individual’s penile blood volume is monitored while he is presented with a standardized sequence of laboratory stimuli depicting male and female children and adults. Increases in the patient’s penile blood volume (i.e., degrees of penile erection) are used as the measure of his attraction to different classes of persons.”

In other words, sex researchers studying these men strap a device over the subjects’ penises to measure how swollen or flaccid their penises become in response to various kinds of pictures and audiotapes. The idea is that, the more erect the penis, the more the man has been stimulated by the particular sexual material being presented to him. Blanchard further describes the subject population of his recent study:

“The subjects were 2,278 male patients referred to a specialty clinic for phallometric assessment of their erotic preferences. All underwent the same test, which measured their penile responses to six classes of stimuli: prepubescent girls, pubescent girls, adult women, prepubescent boys, pubescent boys, and adult men. The stimuli were not, of course, live persons, but rather audiotaped narratives describing sexual interactions with prepubescent girls, pubescent girls, and so on. These narratives were accompanied by slides showing nude models who corresponded in age and gender to the topic of the narrative. The slides did not show the models doing anything sexual or even suggestive but rather resembled photographic illustrations of physical maturation in a medical textbook.”

So, the prediction from the model here would be that homosexuals should have an elevated young age response. The figure looks like this.

They look quite mirrored. Given the absence of the sample sizes, one cannot really infer anything about the homosexuality x pedophilia link. I was unable to find the actual study (link was dead).

All in all, there does seem to be a link.

The differential psychology of sexual offenders

While looking for the above studies, I stumbled upon some other interesting studies:

The EPQ and a lifestyle questionnaire were completed by 77 members of the Paedophile Information Exchange (PIE), a self-help club for men who are sexually attracted to children. Compared with control males the paedophiles were significantly introverted and high on P and N. Examination of individual items revealed that PIE members were more likely to be shy, sensitive, lonely, depressed and humourless, but they were not particularly troubled by guilt, obsessionality or worry about their looks. Individual variations within the paedophile sample were also found. Those who were high on P and low on E were interested in younger children and were less able to contemplate sex with adults. Paedophiles high on N were less happy about their sexual preference and were more likely to have sought treatment.

A sample of 473 male patients with pedophilia (assessed by the patients’ sexual history and penile response in the laboratory to standardized, erotic stimuli) or other problematic sexual interests or behaviors received brief neuropsychological assessments. Neuropsychological measures included a short form of the Wechsler Adult Intelligence Scale–Revised (D. Wechsler, 1981), the Hopkins Verbal Learning Test–Revised (R. H. B. Benedict, D. Schretlen, L. Groninger, & J. Brandt, 1998), the Brief Visuospatial Memory Test–Revised (R. H. B. Benedict, 1997), and the Edinburgh Handedness Inventory (S. M. Williams, 1986). Pedophilia showed significant negative correlations with IQ and immediate and delayed recall memory. Pedophilia was also related to non-right-handedness even after covarying age and IQ. These results suggest that pedophilia is linked to early neurodevelopmental perturbations.

We also have from the Blanchard study above:

And from a later study:

The samples overlap somewhat I think. Still, there does seem to be an age of interest x IQ interaction, consistent with a kind of wicked (one-way) assortative mating. Dull men are cognitively more like children, and the more dull, the closer to the mental level of children they are, and would thus be better able to bond with them.

Blanchard summarizes his account in his paper:

The Neurodevelopmental Hypothesis of Pedophilia
Building on clinical and systematic observations going back more than a century (e.g., von Krafft-Ebing 1965), Blanchard et al. (2002) hypothesized that neurodevelopmental problems in prenatal life or early childhood increase the risk of pedophilia in males. If this hypothesis is correct, then pedophiles should show other signs of perturbed neurodevelopment. Two such signs are of special interest, because both are associated with a very wide range of neurodevelopmental insults or stresses. The first such sign is poor cognitive function (or lower than expected IQ). The evidence that poor cognitive function can result from a variety of adverse neurodevelopmental events or conditions includes several lines of research. First, acquired neurologic damage during infancy or early childhood has profound and long-lasting cognitive effects. This has been demonstrated among children with brain tumors (Radcliffe et al. 1994), traumatic brain injury (Taylor et al. 1999), intracranial hemorrhage (Dennis and Barnes 1994), perinatal hypoxia (Gottfried 1973), and epilepsy (Neyens et al. 1999). Second, exposure to neurotoxic substances, either in utero or early in postnatal development, can have similar robust effects on cognition. Such effects have been associated with several teratogenic substances including lead (Needleman et al. 1990), coumarins (Wesseling et al. 2001), alcohol (Olson et al. 1998), and tobacco (Frydman 1996). Third, genetic disorders, with known adverse neurobiological effects, have also been connected with low cognitive functioning. For example, children with fragile X syndrome (Fisch et al. 1996), velocardiofacial syndrome (Kozma 1998), and Down syndrome (Hayes and Batshaw 1993) typically demonstrate significant intellectual impairment. The second nonspecific sign of neurodevelopmental problems in utero is non-right-handedness (i.e., left-handedness, or substantial use of both hands for common tasks, especially writing).
Non-right-handedness occurs 1.5–3.0 times more frequently in populations with any of several neurological disorders. Such disorders include Down Syndrome (e.g., Batheja and McManus 1985), epilepsy (e.g., Schachter et al. 1995), autism (e.g., Soper et al. 1986), learning disabilities and dyslexia (e.g., Cornish and McManus 1996), and mental retardation (e.g., Grouios et al. 1999). Elevated levels of non-right-handedness have also been shown to be associated with biological stresses occurring pre- and perinatally, achieving frequencies of non-right-handedness comparable to those in the aforementioned pervasive developmental disorders (e.g., Searleman et al. 1988). Such pre- and perinatal stressors include premature birth (e.g., Marlow et al. 1989; Ross et al. 1992), twinning and multiple births (e.g., Coren 1994; Davis and Annett 1994; Williams et al. 1992), and low birth weight (e.g., O’Callaghan et al. 1987; Powls et al. 1996). It must be stressed that Blanchard et al. (2002) did not hypothesize that poor cognitive functioning (or non-right-handedness) causes pedophilia. They predicted, rather, that pedophilia will correlate with poor cognitive functioning and non-right-handedness if neurodevelopmental problems predispose a male to develop all three. They also did not hypothesize that neurodevelopmental problems are the only causes of pedophilia, simply that they contribute to the risk of this disorder.

Empirical Findings on Pedophilia, IQ, and Handedness
For several years, we have been testing predictions from the hypothesis of Blanchard et al. (2000) in an ongoing research program. The main published studies are presented here. In the most comprehensive individual investigation of IQ in pedophiles published to date, Cantor et al. (2004) examined a heterogeneous group of 454 men undergoing clinical assessment for various sexual offenses or problematic interests. The sexological variables included the patients’ numbers of victims in each of several age groups, their numbers of consenting adult sexual partners, and their penile responses in the laboratory to standardized stimuli depicting males and females of various ages (i.e., phallometric test results). Analyses revealed lower IQ scores to be related to greater numbers of child victims, and higher IQ scores to be related to greater numbers of consenting, adult sexual partners. Similarly, lower IQ scores were associated with greater phallometric responses to sexual stimuli involving children, and higher IQ scores were associated with greater responses to stimuli involving adults. The subjects also demonstrated significant group differences in IQ when trichotomized on the basis of their phallometric test results into pedophiles, hebephiles, and teleiophiles. The mean IQs of these groups were 89.5, 93.7, and 97.8, respectively. It should be noted that the pedophiles’ mean IQ was not in the mentally retarded range, but it was two-thirds of a standard deviation lower than the population mean of 100. The findings of Cantor et al. (2004) were supported by a subsequent meta-analysis of IQ data in sex offenders (Cantor et al. 2005a). This study, among other things, compared the IQ scores of 56 samples of adult sexual offenders against children (3,187 individuals), 8 samples of sexual offenders against adults (302 individuals), and 53 samples of nonsexual offenders (16,222 men convicted of nonsexual crimes).
The sexual offenders against children had significantly lower IQs than the nonsexual offenders; the sexual offenders against adults were intermediate. Cantor et al. (2005a) also found that the mean IQs of samples of sexual offenders against children were related to the criterion used for defining the offender’s victim as a child: For example, the mean IQs from samples of men who offended against victims age 13 or younger were lower than the IQs from samples of men who offended against victims age 17 or younger. The relation between handedness and pedophilia has been examined in three published studies. The first relevant data were produced by Bogaert (2001). He found a slightly but significantly higher rate of non-right-handedness in a sample of sexual offenders against (unrelated) children under age 12 compared with a sample of controls. Bogaert’s finding was confirmed by Cantor et al. (2004) and Cantor et al. (2005b), who assessed pedophilic interest with more extensive offense-history data than was available to Bogaert and also with phallometric testing. Cantor et al. (2005b) found that the rate of non-right-handedness in pedophilic men was nearly triple that in teleiophilic men.

If we accept that the common cause of this involves developmental instability, it is not clear whether the cause is merely genetic confounding (there’s always genetic confounding, fifth law). But if the maternal age is causal, it doesn’t bode well for the future since maternal age is increasing in conjunction with the dysgenic effects and dysfertility. If it’s not a genetic effect, it means genetic engineering and selection won’t be so effective.


What exactly is the value of psychotherapists?

This must be the most hilariously trolly study I’ve seen in a while.

Understanding the Therapist Contribution to Psychotherapy Outcome: A Meta-Analytic Approach

Understanding the role that therapists play in psychotherapy outcome, and the contribution to outcome made by individual therapist differences has implications for service delivery and training of therapists. In this study we used a novel approach to estimate the magnitude of the therapist contribution overall and the effect of individual therapist differences. We conducted a meta-analysis of studies in which participants were randomised to receive the same treatment either through self-help or through a therapist. We identified a total of 15 studies (commencement N = 910; completion N = 723) meeting inclusion criteria. We found no difference in treatment completion rate and broad equivalence of treatment outcomes for participants treated through self-help and participants treated through a therapist. Also, contrary to our expectations, we found that the variability of outcomes was broadly equivalent, suggesting that differences in efficacy of individual therapists were not sufficient to make therapy outcomes more variable when a therapist was involved. Overall, the findings suggest that self-help, with minimal therapist input, has considerable potential as a first-line intervention. The findings did not suggest that individual differences between therapists play a major role in psychotherapy outcome.

Authors: “Hey psychotherapy people! Yeah you! You don’t seem to do anything of value. Sincerely, science.”, “PS. Maybe get a real job.”

This is on top of loads of data telling us that these people can’t make better clinical predictions than simple algorithms. In fact, algorithms usually beat them.

Of course, the sample size here is not impressively large, and so they might simply have missed a small but real effect. Still, one would expect publication bias to hand them something positive to find, but apparently that was not the case.

Psychiatry Psychology

Is miscegenation bad for your kids?

Generally my stance is that people should date/mate who they want. Furthermore, I don’t personally care much about this question or the future survival of statistical clusters (races). Asserting that there are problems with cross-racial mating smacks of 19th century race thinking. However, it would not be the first time people in the past got things right that we now get wrong. It would be intellectually dishonest (and particularly inconsistent of me, given my interest in the race and intelligence question) not to look into the topic just because it conflicts somewhat with my worldview.

Recently I became aware of the literature on the behavioral/mental problems of multirace persons, usually kids. This information came from white nationalists, so I was skeptical. While a lot of the stuff was anecdotal (see content on /r/hapas ), some was not. Consider:

Health and Behavior Risks of Adolescents with Mixed-Race Identity

Objectives. This study compared the health and risk status of adolescents who identify with 1 race with those identifying with more than 1 race.
Methods. Data are derived from self-reports of race, using the National Longitudinal Study of Adolescent Health (Add Health), which provides a large representative national sample of adolescents in grades 7 through 12. Respondents could report more than 1 race.
Results. Mixed-race adolescents showed higher risk when compared with single-race adolescents on general health questions, school experience, smoking and drinking, and other risk variables.
Conclusions. Adolescents who self-identify as more than 1 race are at higher health and behavior risks. The findings are compatible with interpreting the elevated risk of mixed race as associated with stress.


Are Multiracial Adolescents at Greater Risk? Comparisons of Rates, Patterns, and Correlates of Substance Use and Violence Between Monoracial and Multiracial Adolescents

Rates and patterns of substance use and violent behaviors among multiracial adolescents were examined and compared with 3 monoracial groups, European, African, and Asian Americans. The relationships between ethnic identity and the subjective experience of racial discrimination, substance use, and violent behavior were also examined. The authors found multiracial adolescents reporting higher rates of problem behaviors. Several significant relationships between ethnic identity and racial discrimination were found with these problem behaviors.



  • Two OK-sized samples: 1) Add Health, 2) some ad hoc sample of n=2300. Both USA.
  • Despite ok sample sizes, multirace children are fairly rare, and the effect sizes are not large, so we have insufficient precision to do lots of subgroup analyses. For this reason, both studies use (primarily) a simple way of doing the analysis: monorace vs. multirace, by aggregating all combinations of races into the latter. One could also have aggregated the monoraces, but that isn’t necessary and would obscure things. The sexes were also aggregated. Likely necessary here, but may obscure patterns.
  • The multirace kids do worse on most outcomes. The variation we see may just be due to chance.

A simple compositional model — and I always begin with simple models — predicts that if some group is a mix between A and B, then the mean value of some trait T should be intermediate between A and B’s trait values. This has repeatedly been found with things like IQ, income, and education, so I was expecting it to hold for mental health and various problem behaviors too (I am drug-friendly, so I don’t necessarily view the drug behavior as problematic, but for linguistic simplicity I went with this). And yet it doesn’t.

Since we collapsed the different multirace persons into a single category, the way one can maximize the predicted values is to assume all multirace persons are a mix of the two groups with the highest values, which here are presumably Blacks and American Indians. It’s an unrealistic mix. Most self-identified multirace persons are probably Asian-White and Black-White (Blacks of course already being a mixed group, which complicates matters).

So, supposing we have a phenomenon to explain, what hypotheses can we think of? I can think of 4:

Hybrid/outbreeding depression. This is the opposite of hybrid/outbreeding vigor/heterosis, and happens when there’s breeding between two populations that are so distantly related that genes start malfunctioning (incompatible genes end up in the same body). This can be seen for cross-species hybrids such as ligers (lion+tiger, MF). Some of these can look downright comical, and are often infertile. Note that the order matters: a liger is not the same as a tigon (tiger+lion). Why does order matter? Because the sex chromosomes come from different parents. Male liger will have lion Y chromosome, tiger X chromosome. Male tigon will have tiger Y, lion X. Wikipedia apparently used to have a list of these, but someone deleted it. Now there’s only the template left.

Outbreeding depression is not very probable for humans because humans are so closely related (Fst of .15 or so). However, it does give some interesting predictions, namely that the effect size for the multirace effect should be a function of the difference between the source groups, e.g. as measured in Fst. Thus, the multirace effect should be the largest for part-African mixes, and lower for e.g. White+EAsian (Chi, Jap, Kor), and even lower for White+Indian (as both are Caucasians).

In fact, we would generally expect human mixes to produce outbreeding vigor because mixing would tend to reduce the number of damaging recessive loci. This only happens to the degree that a trait shows dominance effects, for which there is evidence in the case of GCA (general cognitive ability). I don’t know about dominance effects for the problem behaviors mentioned here, but it seems very likely there are some.

Self-selection. Perhaps people who engage in interracial mating (of the children resulting kind) are not a random subset of the population. In fact, we can be very sure that they are not, given that this is a basic and major life choice, and one that’s intensely debated.

People in general do not marry at random (but see); assortative mating is pervasive. Assortative mating, however, does not explain any such multirace effects. Suppose group A has trait value 10, and group B has trait value 20. Under random mating, the persons who cross-group mate are also expected to have trait values of 10 and 20, respectively. However, if we add bidirectional assortative mating, then the cross-maters are expected to be closer to each other: perhaps the A’s will have a mean of 13, and the B’s 17, or maybe both are at 15. It depends on the specifics. Unidirectional assortative mating (if such exists) would tend to ‘push’ one group towards the other (perhaps the A’s would stay at 10 and select B’s with a mean T of 15) but would never result in the mixed group having a trait value that doesn’t fall in between the two source groups.

However, outbreeding is evolutionarily weird and kind of a risky move, so we can expect this to select for persons with counter-normative and risk-taking traits — which seems related to the traits we see above. Depending on how large the effect size is, this can result in the multirace group having a trait mean that does not fall between the two source groups. If we imagine this self-selection to be +10, then the self-selected mating groups would have means of 20 and 30, and then their offspring would have a mean of 25 — 5 higher than B’s mean. In reality, there would also be regression towards the mean that depends on the true breeding familial effect size (vertical transfer + additive genetic (+ partial non-linear)). This means that the self-selection has to be quite large, at least larger than the gap between the two original groups, for it to explain the pattern we see alone.
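
To make the arithmetic in the last two paragraphs concrete, here is a minimal sketch of the compositional model. The function name and the `transmission` parameter are my own illustrative assumptions: `transmission` is a crude stand-in for the regression towards the mean mentioned above, i.e. the fraction of the parents' deviation from their own group mean that is passed on to offspring. It is not an estimate of anything.

```python
def mixed_group_mean(mean_a, mean_b, selection_a=0.0, selection_b=0.0,
                     transmission=1.0):
    """Expected trait mean of the offspring of A x B matings.

    selection_a / selection_b: how far the cross-mating parents deviate
    from their own group's mean (hypothetical values).
    transmission: fraction of that deviation passed on to offspring
    (1.0 = no regression towards the mean; an illustrative assumption).
    """
    parent_a = mean_a + transmission * selection_a
    parent_b = mean_b + transmission * selection_b
    # Simple compositional model: offspring sit at the midparent value.
    return (parent_a + parent_b) / 2


# Random mating: the mix is intermediate.
print(mixed_group_mean(10, 20))                      # 15.0
# Bidirectional assortative mating (A's at 13, B's at 17): still intermediate.
print(mixed_group_mean(10, 20, 3, -3))               # 15.0
# Unidirectional assortative mating (A's at 10, B's at 15): still intermediate.
print(mixed_group_mean(10, 20, 0, -5))               # 12.5
# Self-selection of +10 in both groups: the mix overshoots B's mean of 20.
print(mixed_group_mean(10, 20, 10, 10))              # 25.0
# Same self-selection, but only half of it is transmitted: no overshoot.
print(mixed_group_mean(10, 20, 10, 10, transmission=0.5))  # 20.0
```

The last line shows why the self-selection has to be large: once only part of the parental deviation is transmitted, a +10 selection effect no longer pushes the mixed group beyond the higher source group.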

The self-selection hypothesis is very likely correct to some degree. The question is to which degree. Luckily, it is possible to examine some aspects of the self-selection hypothesis with easily accessible datasets (e.g. NLSY). For instance, there is a prediction that we should see relationships between problem behavior and openness to engage in outbreeding in a cross-sectional sample. This is based on the auxiliary assumption that people who express such openness in surveys also act on it. This is perhaps not so true. To the degree it is not true, we expect the association to be weaker. If one can get one’s hands on data that include spouse information, this will help a lot. Familial data that include both parents and children would be perfect.

Identity issues. In line with the r/hapas discussions and stuff like this, it seems likely some offspring would be prone to identity issues that may affect their behavior. Even if the mixing does not directly affect base-line neuroticism, it would constitute a life-long social stressor.

I note that other groups with identity issues — or what can plausibly be taken as such — also have higher rates of behavioral problems. This includes bisexuals and crime, a finding I semi-replicated in OKCupid data. Non-binary-sexuals have increased rates of mental illness.

Note that identity issues may be just a mediator for problems that have their origin further up the causal chain.

Social stigma. Perhaps the phenomenon is caused by the reactions of others to mixed race persons. Presumably for this to work, such reactions would have to be generally hostile, not generally positive. There are certainly places where the climate towards interbreeding is generally hostile, but there are also places where it is not. Thus, by examining variation in this phenomenon across contexts it should be possible to see a moderation effect. In some versions of this hypothesis, mixed race persons would be predicted to be better than both source populations. Contexts to examine include: nations, sub-regions within nations, time, a person’s network. Maybe one could simply examine this question by using the GSS.

I don’t have time to look into this topic in detail now, but if someone is interested, I’m sure the OKCupid dataset would be of some use. I can guide you, but you will have to do the write-up of the study. I can do the coding if it’s a simple study.


Exercise for depression? Evaluating publication bias in Schuch et al 2016

Wrote a new re-analysis for another meta-analysis of exercise for depression. Found good evidence of publication bias again. Unclear what the effect size really is, but my posterior 95% range is something like -.05 to .50.

[Figures: forest plot and funnel plot]


Amphetamine vs. methylphenidate for ADHD

Brief summary of thinking:

  • The war on drugs makes it difficult or impossible to study drugs on the ban list.
  • A drug’s presence on the ban list is primarily due to psychoactive effects.
  • Medical researchers and companies are either too incompetent or too corrupt/incentivized to perform proper drug comparison studies.
  • The efficacy of drugs is positively related to their psychoactive effects, perhaps because the psychoactive effects are a mediating cause.
  • Amphetamine is more psychoactive than methylphenidate.
  • Amphetamine is often banned outright and when not, usually more controlled than methylphenidate.
  • Conclusion: amphetamine probably works better than methylphenidate.

So, does amphetamine work better than methylphenidate? I can only think of one relevant outcome, ADHD. For this, I could find one direct meta-analysis and one indirect.

Faraone et al 2010

Caveat: publication bias present



Our meta-analysis of ADHD efficacy outcomes found significant differences between amfetamine and methylphenidate products, even after correcting for study design features that might have confounded the results. Our analyses indicate that effect sizes for amfetamine products are statistically, albeit moderately, greater than those for methylphenidate. The robust effects of all stimulant medications can be seen in Figs. 1, 2, 3, which show that most measures of effect from all studies were statistically significant. These figures also show a good deal of variability among studies with overlapping confidence intervals between many methylphenidate and amfetamine studies, which is to be expected given the small differences in the mean SMDs between medication groups.

The authors do not provide a table with the SMDs (standardized mean differences) between the drugs, i.e. their write-up is poor. From skimming their forest plots, however, we can estimate the SMD between drugs. For the largest comparison, “ADHD symptoms”, the effect size for amphetamine is about 1.00, while for methylphenidate it is about 0.70. Thus, the SMD between drugs is about .30 in favor of amphetamine.

Punja et al 2016

This study used a stronger method, a within-person design:

An n-of-1 trial is a multiple crossover trial performed in a single participant, often with randomization and blinding. N-of-1 trials provide an opportunity to determine the effect of an intervention on an individual who may not fit the eligibility criteria for an RCT. Although they are primarily intended to evaluate therapeutic results in a single individual, preliminary data from systematic reviews indicate that the majority of published n-of-1 trials of health interventions comprise a series for the same condition-intervention pair (unpublished data).

Briefly, the design works by following persons over time and randomly switching them back and forth between treatments. A very nice method that could easily be scaled to all clinical use of treatments where we are uncertain about which drug is the most effective.

The write-up is almost as poor as the first study, but they do report the numeric values from the mean difference scores. They do not appear to be standardized, so the scores themselves are fairly uninterpretable. They do not provide standard deviations to convert these to standardized scores. I’ll make the assumption that these scales all have the same or at least similar SDs. They are based on subscales from two primary scales. They provide the mean difference scores for each outcome and each drug. Outcomes have two variants: parent and teacher rated. I used both. Only one datapoint was missing. With the assumption of commonish SD, we can calculate the drug mean differences for each outcome. I’ve extracted the scores into a spreadsheet.



  • I used both means and medians as a robustness check.
  • I used case-wise complete data, i.e. I did not use the total score from parent datapoint for methylphenidate because it had no counterpart for amphetamine.
  • Both method variations favored amphetamine. The scale is not clear, but since the general effect size for both drugs versus placebo is about 3, we can scale the differences to this scale. If we use the SMDs from the Faraone study, we can estimate that amphetamine is .07 to .32 d better than methylphenidate. The Faraone study found about .30 d, so the results match up.
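
The steps in the list above can be sketched roughly as follows. All the numbers in the dictionaries are hypothetical placeholders, not the values from Punja et al, and the outcome names are made up. The rescaling function is my reading of the procedure (raw difference between drugs, divided by the raw drug-vs-placebo effect of about 3, multiplied by an SMD taken from Faraone), so treat it as an assumption rather than the exact calculation in the spreadsheet.

```python
from statistics import mean, median

# HYPOTHETICAL mean difference scores vs. placebo per outcome (raw scale units).
# None marks a missing datapoint.
amphetamine = {"inattention_parent": 3.5, "inattention_teacher": 3.0,
               "hyperactivity_parent": 3.25, "total_parent": None}
methylphenidate = {"inattention_parent": 3.0, "inattention_teacher": 2.75,
                   "hyperactivity_parent": 3.0, "total_parent": 3.1}

# Case-wise complete data: keep only outcomes reported for both drugs.
diffs = [amphetamine[k] - methylphenidate[k]
         for k in amphetamine
         if amphetamine[k] is not None and methylphenidate.get(k) is not None]


def rescale_to_d(raw_difference, raw_effect_vs_placebo, smd_vs_placebo):
    """Convert a raw between-drug difference to an approximate d.

    Assumes all outcomes share a similar SD, so the ratio of raw effects
    equals the ratio of standardized effects.
    """
    return raw_difference / raw_effect_vs_placebo * smd_vs_placebo


RAW_EFFECT_VS_PLACEBO = 3.0   # "about 3" in the text
for label, summary in (("mean", mean(diffs)), ("median", median(diffs))):
    for smd in (0.70, 1.00):  # approximate SMDs read off the Faraone forest plots
        d = rescale_to_d(summary, RAW_EFFECT_VS_PLACEBO, smd)
        print(f"{label} raw diff {summary:.2f}, SMD {smd:.2f} -> d ~ {d:.2f}")
```

Using both the mean and the median of the differences, and both Faraone SMDs as anchors, gives a range of estimates rather than a single number, which is the spirit of the .07 to .32 range reported above.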