Physiognomy: a field ready for scientific revival

People keep asking me about the state of the art re. evidence for physiognomy, so here’s a brief review.

Phrenology used to be considered legit, and then eventually people realized it was all bogus. Since then, it is usually brought up as an example of how science goes wrong in terms of stereotyping, and references to it are used to attack people who don’t agree with Aristotle that the brain is mainly used to cool blood, which is to say, to attack people who study brain size, shape etc. and relate this to differences in human psychology, chiefly intelligence. Some examples of such attacks can be seen here, here and here.

Aside from the political attacks, the skeptical reader might wonder, what does real phrenology look like? Actual phrenology, not a strawman. Well, I came across a 1907 book that I shall take as illustrative and perhaps representative of popular phrenology.

Some screenshots of pages in the book.

I find these to be hilarious, and it strains the mind to think the author was serious. Perhaps he was selling a bullshit book. But maybe he was? The past was a different place. Bloodletting was popular for hundreds if not thousands of years but doesn’t work for much of anything (in fact it is detrimental). Whatever the case, this is what an actual early 1900s phrenology-physiognomy book looks like.

Modern tests

One can generally split the science into two parts. The first relates features of the brain to psychological differences. The second relates visible features to psychological differences. The first is now mainstream, and one can probably find thousands of papers on it in mainstream journals. This development happened in spite of attempts by social justice scholars and Marxists, especially Stephen Jay Gould, to mislead the public (reviews of his main attack book are very informative, see here and here). The second is still controversial, but there is growing evidence for it and I expect it to be quite mainstream 20 years from now. The general hypothesis, that facial characteristics relate to character, is quite sensible. We all make judgments of persons based on quite limited information, including pictures (Tinder, politicians on TV, people in bars and so on), so it is reasonable to suppose that this practice has evolutionary origins because of its adaptive value, which is to say, it is useful because it has some accuracy.

Some recent studies include:

We study, for the first time, automated inference on criminality based solely on still face images, which is free of any biases of subjective judgments of human observers. Via supervised machine learning, we build four classifiers (logistic regression, KNN, SVM, CNN) using facial images of 1856 real persons controlled for race, gender, age and facial expressions, nearly half of whom were convicted criminals, for discriminating between criminals and non-criminals. All four classifiers perform consistently well and empirically establish the validity of automated face-induced inference on criminality, despite the historical controversy surrounding this line of enquiry. Also, some discriminating structural features for predicting criminality have been found by machine learning. Above all, the most important discovery of this research is that criminal and non-criminal face images populate two quite distinctive manifolds. The variation among criminal faces is significantly greater than that of the non-criminal faces. The two manifolds consisting of criminal and non-criminal faces appear to be concentric, with the non-criminal manifold lying in the kernel with a smaller span, exhibiting a law of “normality” for faces of non-criminals. In other words, the faces of general law-biding public have a greater degree of resemblance compared with the faces of criminals, or criminals have a higher degree of dissimilarity in facial appearance than non-criminals.

This paper went viral, and the authors were shamed into publishing an apology of sorts. Still, their introduction is informative:

In all cultures and all periods of recorded human history, people share the belief that the face alone suffices to reveal innate traits of a person. Aristotle in his famous work Prior Analytics asserted, “It is possible to infer character from features, if it is granted that the body and the soul are changed together by the natural affections”. Psychologists have known, for as long as a millennium, the human tendency of inferring innate traits and social attributes (e.g., the trustworthiness, dominance) of a person from his/her facial appearance, and a robust consensus of individuals’ inferences. These are the facts found through numerous studies [3, 39, 5, 6, 10, 26, 27, 34, 32].

Some of the studies cited above are:

These are all pre-replication crisis papers by psychologists looking at evidence for humans being able to determine traits from faces. I didn’t read them closely but they seem to be the typical low power, multi-sample studies, so they are probably not very informative aside from establishing that one can get this kind of thing published and cited in mainstream journals. Of course, we know that anything humans can do by intuitive judgment can be done better by a machine given sufficient training data and the right algorithm. So, are there more recent computer studies that provide strong evidence?
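
To put “low power” in numbers, here is a rough sketch of a power calculation for a correlation, using the standard Fisher z approximation. The effect size (r = 0.2) and the sample sizes (50 and 500) are my own illustrative picks, not values taken from the cited papers, and the function name is made up for this post.

```python
from math import atanh, sqrt
from statistics import NormalDist


def correlation_power(r, n, alpha=0.05):
    """Approximate two-sided power to detect a true correlation r with n subjects.

    Uses the Fisher z transformation: atanh(sample r) is roughly normal
    with standard error 1 / sqrt(n - 3).
    """
    normal = NormalDist()
    z_crit = normal.inv_cdf(1 - alpha / 2)   # critical value, about 1.96 for alpha = 0.05
    noncentrality = atanh(r) * sqrt(n - 3)   # expected z statistic under the true r
    # Probability of landing in either rejection tail.
    return normal.cdf(noncentrality - z_crit) + normal.cdf(-noncentrality - z_crit)


# Illustrative values only (assumptions, not taken from any study above).
for n in (50, 500):
    print(n, round(correlation_power(0.2, n), 2))
```

Under these assumptions a study with 50 subjects detects the effect well under half the time, while one with 500 subjects almost always does, which is why small-sample face studies say little either way.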

The authors from before have a follow-up paper (this time being a bit less blunt!):

This article is a sequel to our earlier work [25]. The main objective of our research is to explore the potential of supervised machine learning in face-induced social computing and cognition, riding on the momentum of much heralded successes of face processing, analysis and recognition on the tasks of biometric-based identification. We present a case study of automated statistical inference on sociopsychological perceptions of female faces controlled for race, attractiveness, age and nationality. Our empirical evidences point to the possibility of training machine learning algorithms, using example face images characterized by internet users, to predict perceptions of personality traits and demeanors.

Does it work?

But this study was just predicting rated attractiveness of women, so not really a psychological trait. It could however be quite useful for automating dating app usage.

What about sexual orientation? This one has obvious evolutionary relevance for mating purposes, so humans should be somewhat adept at it. There are several studies.

We show that faces contain much more information about sexual orientation than can be perceived or interpreted by the human brain. We used deep neural networks to extract features from 35,326 facial images. These features were entered into a logistic regression aimed at classifying sexual orientation. Given a single facial image, a classifier could correctly distinguish between gay and heterosexual men in 81% of cases, and in 71% of cases for women. Human judges achieved much lower accuracy: 61% for men and 54% for women. The accuracy of the algorithm increased to 91% and 83%, respectively, given five facial images per person. Facial features employed by the classifier included both fixed (e.g., nose shape) and transient facial features (e.g., grooming style). Consistent with the prenatal hormone theory of sexual orientation, gay men and women tended to have gender-atypical facial morphology, expression, and grooming styles. Prediction models aimed at gender alone allowed for detecting gay males with 57% accuracy and gay females with 58% accuracy. Those findings advance our understanding of the origins of sexual orientation and the limits of human perception. Additionally, given that companies and governments are increasingly using computer vision algorithms to detect people’s intimate traits, our findings expose a threat to the privacy and safety of gay men and women.

And there is a pretty close replication.

Recent research used machine learning methods to predict a person’s sexual orientation from their photograph (Wang and Kosinski, 2017). To verify this result, two of these models are replicated, one based on a deep neural network (DNN) and one on facial morphology (FM). Using a new dataset of 20,910 photographs from dating websites, the ability to predict sexual orientation is confirmed (DNN accuracy male 68%, female 77%, FM male 62%, female 72%). To investigate whether facial features such as brightness or predominant colours are predictive of sexual orientation, a new model based on highly blurred facial images was created. This model was also able to predict sexual orientation (male 63%, female 72%). The tested models are invariant to intentional changes to a subject’s makeup, eyewear, facial hair and head pose (angle that the photograph is taken at). It is shown that the head pose is not correlated with sexual orientation. While demonstrating that dating profile images carry rich information about sexual orientation these results leave open the question of how much is determined by facial morphology and how much by differences in grooming, presentation and lifestyle. The advent of new technology that is able to detect sexual orientation in this way may have serious implications for the privacy and safety of gay men and women.

So, like human observers, machines can predict sexual orientation from images.

Moving on to other traits, what about autism?

  • Tan, D. W., Gilani, S. Z., Maybery, M. T., Mian, A., Hunt, A., Walters, M., & Whitehouse, A. J. (2017). Hypermasculinised facial morphology in boys and girls with autism spectrum disorder and its association with symptomatology. Scientific reports, 7(1), 9348.

Elevated prenatal testosterone exposure has been associated with Autism Spectrum Disorder (ASD) and facial masculinity. By employing three-dimensional (3D) photogrammetry, the current study investigated whether prepubescent boys and girls with ASD present increased facial masculinity compared to typically-developing controls. There were two phases to this research. 3D facial images were obtained from a normative sample of 48 boys and 53 girls (3.01–12.44 years old) to determine typical facial masculinity/femininity. The sexually dimorphic features were used to create a continuous ‘gender score’, indexing degree of facial masculinity. Gender scores based on 3D facial images were then compared for 54 autistic and 54 control boys (3.01–12.52 years old), and also for 20 autistic and 60 control girls (4.24–11.78 years). For each sex, increased facial masculinity was observed in the ASD group relative to control group. Further analyses revealed that increased facial masculinity in the ASD group correlated with more social-communication difficulties based on the Social Affect score derived from the Autism Diagnostic Observation Scale-Generic (ADOS-G). There was no association between facial masculinity and the derived Restricted and Repetitive Behaviours score. This is the first study demonstrating facial hypermasculinisation in ASD and its relationship to social-communication difficulties in prepubescent children.

So in plain English: they took 3D photos of non-autistic kids and used the features that differ between the sexes to build a continuous masculinity score. Then they applied this score to another sample of autistic and control kids, and found the results you see above: the autistic kids are masculinized compared to their sex norms. The autistic girls are almost halfway towards the normal male distribution! Having dated a number of autistic girls, I was not at all surprised by these results (they also have noticeably more arm hair and deeper voices).
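
A minimal sketch of that two-step logic, on made-up numbers rather than real 3D face data. This is not the authors' pipeline: the three “measurements” per face, the group shifts, the sample sizes and the function names are all assumptions for illustration, and the score is a naive per-feature discriminant, not the method from the paper.

```python
import random
from statistics import mean, pvariance


def fit_gender_score(boys, girls):
    """Phase 1: learn a continuous score from a normative sample.

    Each face is a list of numeric measurements. Every feature is weighted
    by how strongly it separates the sexes (mean difference divided by the
    pooled variance). Positive scores mean more male-typical.
    """
    weights, midpoints = [], []
    for j in range(len(boys[0])):
        b = [face[j] for face in boys]
        g = [face[j] for face in girls]
        pooled_var = (pvariance(b) + pvariance(g)) / 2
        weights.append((mean(b) - mean(g)) / pooled_var)
        midpoints.append((mean(b) + mean(g)) / 2)

    def score(face):
        return sum(w * (x - m) for w, x, m in zip(weights, face, midpoints))

    return score


def fake_faces(rng, n, shift):
    """Synthetic stand-in for face measurements (hypothetical data)."""
    return [[rng.gauss(shift, 1.0) for _ in range(3)] for _ in range(n)]


rng = random.Random(0)
score = fit_gender_score(fake_faces(rng, 50, 1.0), fake_faces(rng, 50, -1.0))

# Phase 2: apply the fixed score to new groups and compare the means.
control_girls = fake_faces(rng, 30, -1.0)
shifted_girls = fake_faces(rng, 30, 0.0)  # assumed shift toward the male mean
gap = mean([score(f) for f in shifted_girls]) - mean([score(f) for f in control_girls])
print("mean score gap:", round(gap, 2))
```

The point of the design is that the score is fixed on the normative sample first, so the comparison between the two new groups cannot leak into how the score was built.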

There is even a recent review of face to trait studies:

  • Jia, X., Tian, W., & Fan, Y. (2018, November). Physiognomy in New Era: A Survey of Automatic Personality Prediction Based on Facial Image. In International Conference on Internet of Things as a Service (pp. 12-29). Springer, Cham.

At present, personality computing technology facilitates the understanding, prediction, and management of human behavior. With the increasing importance of faces in personal daily assessments, establishing a relationship between facial morphological features and personality traits is a major breakthrough in personality computing technology. This paper is a survey of such technology of automatic personality prediction based on face and it aims at providing not only a solid knowledge base about the state-of-the-art in automatic personality prediction, but also to provide a conceptual model of automatic personality prediction, based on the literature. In addition, the analysis of the prediction results of the existing researches is emphasized, and there are still problems in the field, such as lack of information on research data, single age group of the sample population, incomplete design characteristics of the artificial design etc., and the potential applications and development directions are determined.

There is also newer research using human subjects. For instance, someone wrote a dissertation on it at Cornell University no less:

Can participants accurately determine whether someone will later become a criminal based only on the person’s high school yearbook photo? This project builds on previous research which has found participants are capable of accurately and reliably assessing personality characteristics (like trustworthiness and dominance) based only on a photograph. This paper discusses a series of studies which examine whether participants are also capable of making accurate predictions of criminality by utilizing high school yearbook photographs of men with later criminal records. In Study 1, participants were able to make accurate predictions of future criminality from high school yearbook photographs. In Study 2, the results from the previous study were replicated and confidence in criminality attributions was found to predict accuracy. In Study 3, participants were less accurate when judging photographs of Black students compared to White students, suggesting cross-race bias. Altogether, these studies demonstrated that participants have accurate stereotypes about what a person with a criminal record looks like. These stereotypes may create a self-fulfilling prophecy in which people who look criminal are treated like criminals and thus end up with criminal records. This theory was tested in Study 4 in which participants were asked to judge guilt based on mugshots of exonerated men and true criminals. Overall, this series of studies demonstrated that participants can make accurate and consistent predictions of future criminality based only on facial appearance.

Every day, people make quick, spontaneous and automatic appearance-based inferences of others. This is particularly true for social attributes, such as intelligence or attractiveness, but also aggression and criminality. There are also indications that certain personality traits, such as the dark traits (i.e. Machiavellianism, narcissism, psychopathy, sadism), influence the degree of accuracy of appearance-based inferences, even though not all authors agree to this. Therefore, this study aims to investigate whether there are interpersonal advantages related to the dark traits when assessing someone’s criminality. For that purpose, an on-line study was conducted on a convenience sample of 676 adult females, whose task was to assess whether a certain person was a criminal or not based on their photograph. The results have shown that narcissism and Machiavellianism were associated with a greater tendency of indicating that someone is a criminal, reflecting an underlying negative bias that the individuals high on these traits hold about people in general.

What about the weird cranium bumps stuff?

There is already a modern study of this.

Phrenology was a nineteenth century endeavour to link personality traits with scalp morphology, which has been both influential and fiercely criticised, not least because of the assumption that scalp morphology can be informative of underlying brain function. Here we test the idea empirically rather than dismissing it out of hand. Whereas nineteenth century phrenologists had access to coarse measurement tools (digital technology referring then to fingers), we were able to re-examine phrenology using 21st century methods and thousands of subjects drawn from the largest neuroimaging study to date. High-quality structural MRI was used to quantify local scalp curvature. The resulting curvature statistics were compared against lifestyle measures acquired from the same cohort of subjects, being careful to match a subset of lifestyle measures to phrenological ideas of brain organisation, in an effort to evoke the character of Victorian times. The results represent the most rigorous evaluation of phrenological claims to date. [sample size is 5.7k people from UKBB]

So, while this is only a single study, we can probably be confident that bumps on the scalp aren’t terribly informative about personality, except in gross cases of brain injury, which sometimes cause personality changes.

So, all in all, modern science confirms that human psychological differences relate to visual appearance, including variation in facial features. Humans pick up on these cues automatically and use them to increase the accuracy of their social judgments, in the same way they incorporate group averages (stereotypes). No scientist should be very surprised by these findings.

On a personal note: I’ve been meaning to do some of my own research on this using data scraped from various dating sites and applications. OKCupid data is especially good for this given the rich personality data, but the site is quite bad and not very popular anymore. A big shame! Instead, one will have to rely on data from Tinder, Hinge, Coffee Meets Bagel etc. These datasets don’t generally provide information on the more interesting traits such as sexual paraphilias/kinks (who likes anal sex? what about foot fetish?), criminality (aside from the ancestry and sex link), and detailed political beliefs (what does the typical libertarian look like? Aside from the coffee salon demographics). I haven’t had the time to do this research due to being busy doing work on the genomics of race differences. Get in contact with me if you want to collaborate. I have a lot of data but little experience in these kinds of algorithms.

