Emil O. W. Kirkegaard
We present and analyze a dataset of 2358 Danish first names and their socioeconomic outcomes that has not previously been made available to the public (Navnehjulet, the Name Wheel). We visualize the data and show that there is a general socioeconomic factor (S factor) with indicator loadings in the expected directions (positive: income, owning one's own home; negative: having a criminal conviction, being without a job). This result holds after controlling for age, and within each gender analyzed alone. It also holds when the data are analyzed in age bins. The factor loading of being married depends on the analysis method, so it is more difficult to interpret.
A pseudofertility is calculated for each name from its population size in 2012 and 2015. This value is negatively correlated with the S factor score, r = -.35 [95% CI: -.39; -.31], but the relationship seems to be somewhat non-linear, with an upward trend at the very high end of the S factor. The relationship is strongly driven by relatively uncommon names that have high pseudofertility and low to very low S scores. The n-weighted correlation is -.21 [95% CI: -.25; -.17]. This dysgenic pseudofertility seems to be driven mostly by Arabic and African names.
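To make the two quantities above concrete, the following is a minimal Python sketch, not the R code used for the analysis. It assumes that pseudofertility is the ratio of a name's 2015 population to its 2012 population (the exact formula is not stated here) and that the n-weighted correlation is a Pearson correlation weighted by the number of persons bearing each name. The function names and the toy numbers are hypothetical and serve only as illustration.

```python
import math


def pseudofertility(n_2012, n_2015):
    """Assumed definition: bearers of a name in 2015 divided by bearers in 2012.

    Values above 1 mean the name became more common over the period.
    """
    return n_2015 / n_2012


def weighted_corr(x, y, w):
    """Pearson correlation of x and y with non-negative case weights w."""
    total = sum(w)
    # Weighted means of both variables.
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / total
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / total
    # Weighted covariance and variances around those means.
    cov = sum(wi * (xi - mean_x) * (yi - mean_y) for wi, xi, yi in zip(w, x, y)) / total
    var_x = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x)) / total
    var_y = sum(wi * (yi - mean_y) ** 2 for wi, yi in zip(w, y)) / total
    return cov / math.sqrt(var_x * var_y)


# Toy illustration with made-up numbers (not taken from the dataset).
counts_2012 = [12000, 800, 150, 5000]
counts_2015 = [11800, 900, 210, 4950]
s_scores = [0.4, -0.9, -1.8, 0.7]

pf = [pseudofertility(a, b) for a, b in zip(counts_2012, counts_2015)]
unweighted = weighted_corr(s_scores, pf, [1] * len(pf))
n_weighted = weighted_corr(s_scores, pf, counts_2015)
```

With equal weights the function reduces to the ordinary Pearson correlation. Weighting by name frequency reduces the influence of rare names, which is consistent with the weaker n-weighted correlation reported above.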
All data and R code are freely available.
Key words: names, Denmark, Danish, social status, crime, income, education, age, scraping, S factor, general socioeconomic factor