So, she came up with:
So I decided to try it out, since I’m taking a break from reading Lilienfeld, which I had been doing for 5 hours straight or so.
So the question is whether inbreeding measures have incremental validity over IQ and Islam, which I have previously used to examine immigrant performance in a number of studies.
So, to get the data into R, I OCR’d the PDF in Abbyy FineReader, since this program allows for easy copying of table data by row or column. I only wanted columns 1-2 and didn’t want to deal with the hassle of importing it via spreadsheet programs (which need a consistent separator, e.g. comma or space). Then I merged it with the megadataset (version 2.0d) to create a new version, 2.0e.
Then I created a subset of the data with the variables of interest and renamed them (otherwise the results would be unwieldy). The Pearson intercorrelations are:
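Since `read.table("clipboard")` only works on Windows, here is a minimal sketch of the same kind of import from pasted tab-separated text. The country names and numbers are made-up placeholders, not values from the source table, and `inbreed.demo` is a hypothetical object name.

```r
# Hypothetical pasted text: tab-separated, header row, country name in column 1.
# All figures are placeholders for illustration only.
pasted = "Country\tCousinPercent\tCoefInbreed
Ruritania\t10.5\t0.0050
Borduria\t1.2\t0.0004
Syldavia\t35.0\t0.0200"

# text= reads from a string instead of a file; row.names=1 uses the first column as row names
inbreed.demo = read.table(text = pasted, sep = "\t", header = TRUE,
                          row.names = 1, stringsAsFactors = FALSE)
print(inbreed.demo)
```

The tab separator matters here because country names can contain spaces, which would break a whitespace-separated import.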
Variable | CousinPercent | CoefInbreed | IQ | Islam | S.in.DK
---|---|---|---|---|---
CousinPercent | 1.00 | 0.52 | -0.59 | 0.78 | -0.76
CoefInbreed | 0.52 | 1.00 | -0.28 | 0.40 | -0.55
IQ | -0.59 | -0.28 | 1.00 | -0.27 | 0.54
Islam | 0.78 | 0.40 | -0.27 | 1.00 | -0.71
S.in.DK | -0.76 | -0.55 | 0.54 | -0.71 | 1.00
Spearman correlations, which are probably better given the non-normal data:
Variable | CousinPercent | CoefInbreed | IQ | Islam | S.in.DK
---|---|---|---|---|---
CousinPercent | 1.00 | 0.91 | -0.63 | 0.67 | -0.73
CoefInbreed | 0.91 | 1.00 | -0.55 | 0.61 | -0.76
IQ | -0.63 | -0.55 | 1.00 | -0.23 | 0.72
Islam | 0.67 | 0.61 | -0.23 | 1.00 | -0.61
S.in.DK | -0.73 | -0.76 | 0.72 | -0.61 | 1.00
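The two inbreeding measures correlate far more strongly under Spearman (0.91) than under Pearson (0.52), which is what one would expect from a monotone but non-linear relationship. A small sketch on simulated data (not the megadataset) shows the effect; the variables `x` and `y` are purely illustrative and the relationship is deliberately noise-free.

```r
set.seed(1)
x = rexp(70)   # right-skewed simulated variable
y = x^3        # monotone but strongly non-linear function of x

# Pearson measures linear association; Spearman works on ranks,
# so any strictly increasing transformation leaves it at 1
pearson = cor(x, y, method = "pearson")
spearman = cor(x, y, method = "spearman")
print(round(c(pearson = pearson, spearman = spearman), 2))
```

With real data there is noise as well, so the rank correlation will fall below 1, but the gap between the two coefficients has the same source.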
The fairly high correlations of the inbreeding measures with IQ and Islam mean that their incremental validity will likely be modest.
However, let’s try modeling them. I create 7 models of interest and compile the primary measure of interest from them, adjusted R2, into an object. It looks like this:
Line | Model | R2 adj.
---|---|---
1 | S.in.DK ~ IQ+Islam | 0.5472850
2 | S.in.DK ~ IQ+Islam+CousinPercent | 0.6701305
3 | S.in.DK ~ IQ+Islam+CoefInbreed | 0.7489312
4 | S.in.DK ~ Islam+CousinPercent | 0.6776841
5 | S.in.DK ~ Islam+CoefInbreed | 0.7438711
6 | S.in.DK ~ IQ+CousinPercent | 0.5486674
7 | S.in.DK ~ IQ+CoefInbreed | 0.4979552
So we see that either of them adds a fair amount of incremental validity to the base model (line 1 vs. 2-3). They are in fact better than IQ if one substitutes them in (1 vs. 4-5). They can also substitute for Islam, but only with about the same predictive power (1 vs 6-7).
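The logic of the comparison (does adding a predictor raise adjusted R2 over the base model?) can be sketched on simulated data. This is not the megadataset: the data frame `sim`, its coefficients and the sample size are assumptions chosen only so that the added predictor has a real effect while being correlated with the other two.

```r
set.seed(42)
n = 70
iq = rnorm(n)
islam = rnorm(n)
# simulated inbreeding measure, correlated with both existing predictors
inbreeding = 0.6 * islam - 0.4 * iq + rnorm(n, sd = 0.7)
# simulated outcome with a genuine independent effect of inbreeding
s = 0.5 * iq - 0.4 * islam - 0.8 * inbreeding + rnorm(n, sd = 0.6)
sim = data.frame(S = s, IQ = iq, Islam = islam, Inbreeding = inbreeding)

base = lm(S ~ IQ + Islam, data = sim)               # base model
full = lm(S ~ IQ + Islam + Inbreeding, data = sim)  # base model plus the new predictor

adj.r2 = c(base = summary(base)$adj.r.squared,
           full = summary(full)$adj.r.squared)
print(round(adj.r2, 3))
print(anova(base, full))  # F-test for the nested model comparison
```

Adjusted R2 penalizes extra predictors, so unlike plain R2 it only goes up when the new variable explains more than chance would; `anova()` on the nested pair gives a formal test of the same thing.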
Replication for Norway
Replication is important in science. Let’s try Norwegian data. The Finnish and Dutch data are not well-suited for this (too few immigrant groups and few outcome variables, i.e. only crime).
Pearson intercorrelations:
Variable | CousinPercent | CoefInbreed | IQ | Islam | S.in.NO
---|---|---|---|---|---
CousinPercent | 1.00 | 0.52 | -0.59 | 0.78 | -0.78
CoefInbreed | 0.52 | 1.00 | -0.28 | 0.40 | -0.46
IQ | -0.59 | -0.28 | 1.00 | -0.27 | 0.60
Islam | 0.78 | 0.40 | -0.27 | 1.00 | -0.72
S.in.NO | -0.78 | -0.46 | 0.60 | -0.72 | 1.00
Spearman:
Variable | CousinPercent | CoefInbreed | IQ | Islam | S.in.NO
---|---|---|---|---|---
CousinPercent | 1.00 | 0.91 | -0.63 | 0.67 | -0.77
CoefInbreed | 0.91 | 1.00 | -0.55 | 0.61 | -0.71
IQ | -0.63 | -0.55 | 1.00 | -0.23 | 0.75
Islam | 0.67 | 0.61 | -0.23 | 1.00 | -0.47
S.in.NO | -0.77 | -0.71 | 0.75 | -0.47 | 1.00
These look fairly similar to Denmark.
And the regression results:
Line | Model | R2 adj.
---|---|---
1 | S.in.NO ~ IQ+Islam | 0.5899682
2 | S.in.NO ~ IQ+Islam+CousinPercent | 0.7053999
3 | S.in.NO ~ IQ+Islam+CoefInbreed | 0.7077162
4 | S.in.NO ~ Islam+CousinPercent | 0.6826272
5 | S.in.NO ~ Islam+CoefInbreed | 0.6222364
6 | S.in.NO ~ IQ+CousinPercent | 0.6080922
7 | S.in.NO ~ IQ+CoefInbreed | 0.5460777
Fairly similar too. If added, they have incremental validity (line 1 vs. 2-3). They perform better than IQ if substituted but not as much as in the Danish data (1 vs. 4-5). They can also substitute for Islam (1 vs. 6-7).
How to interpret?
Since inbreeding does not seem to have any direct influence on behavior that is reflected in the S factor, it is not so easy to interpret these findings. Inbreeding leads to various health problems and lower g in offspring, the latter of which may have some effect. However, national IQs presumably already reflect the lowered IQ from inbreeding, so there should be no additional effect there beyond national IQs. Perhaps inbreeding results in other psychological problems that are relevant.
Another idea is that inbreeding rates reflect non-g psychological traits that are relevant to adapting to life in Denmark. Perhaps it is a useful measure of clannishness, which would show up as hostility towards integration into Danish society (e.g. towards getting an education, or as a lack of sympathy for, or antipathy towards, ethnic Danes and resulting higher crime rates against them), which in turn would be reflected in the S factor.
The lack of well-established causal routes makes me somewhat cautious about how to interpret this finding.
    ##Code for merging cousin marriage+inbreeding data with megadataset
    library(Hmisc) #for rcorr()
    library(QuantPsyc) #for lm.beta(), standardized betas
    inbreed = read.table("clipboard", sep="\t", header=TRUE, row.names=1) #load data from clipboard
    source("merger.R") #load mega functions
    mega20d = read.mega("Megadataset_v2.0d.csv") #load latest megadataset
    names = as.abbrev(rownames(inbreed)) #get abbreviated names
    rownames(inbreed) = names #set them as rownames

    #merge and save
    mega20e = merge.datasets(mega20d,inbreed,1) #merge to create v. 2.0e
    write.mega(mega20e,"Megadataset_v2.0e.csv") #save it

    #select subset of interesting data
    dk.data = subset(mega20e, select=c("Weighted.mean.consanguineous.percentage.HobenEtAl2010",
                                       "Weighted.mean.coefficient.of.inbreeding.HobenEtAl2010",
                                       "LV2012estimatedIQ",
                                       "IslamPewResearch2010",
                                       "S.factor.in.Denmark.Kirkegaard2014"))
    colnames(dk.data) = c("CousinPercent","CoefInbreed","IQ","Islam","S.in.DK") #shorter var names
    rcorr.P = rcorr(as.matrix(dk.data)) #pearson correlation object
    View(round(rcorr.P$r,2)) #view correlations, round to 2
    rcorr.S = rcorr(as.matrix(dk.data),type = "spearman") #spearman correlation object
    View(round(rcorr.S$r,2)) #view correlations, round to 2

    #Multiple regression
    results = as.data.frame(matrix(data = NA, nrow=0, ncol = 1)) #empty matrix for results
    colnames(results) = "R2 adj."
    models = c("S.in.DK ~ IQ+Islam", #base model
               "S.in.DK ~ IQ+Islam+CousinPercent", #1. inbreeding var
               "S.in.DK ~ IQ+Islam+CoefInbreed", #2. inbreeding var
               "S.in.DK ~ Islam+CousinPercent", #without IQ
               "S.in.DK ~ Islam+CoefInbreed", #without IQ
               "S.in.DK ~ IQ+CousinPercent", #without Islam
               "S.in.DK ~ IQ+CoefInbreed") #without Islam
    for (model in models){ #run all the models
      fit.model = lm(as.formula(model), dk.data) #fit model
      sum.stats = summary(fit.model) #summary stats object
      print(sum.stats) #summary stats (explicit print is needed inside a loop)
      print(lm.beta(fit.model)) #standardized betas
      results[model,] = sum.stats$adj.r.squared #add result to results object
    }
    View(results) #view results

    ##Let's try Norway too
    no.data = subset(mega20e, select=c("Weighted.mean.consanguineous.percentage.HobenEtAl2010",
                                       "Weighted.mean.coefficient.of.inbreeding.HobenEtAl2010",
                                       "LV2012estimatedIQ",
                                       "IslamPewResearch2010",
                                       "S.factor.in.Norway.Kirkegaard2014"))
    colnames(no.data) = c("CousinPercent","CoefInbreed","IQ","Islam","S.in.NO") #shorter var names
    rcorr.P = rcorr(as.matrix(no.data)) #pearson correlation object
    View(round(rcorr.P$r,2)) #view correlations, round to 2
    rcorr.S = rcorr(as.matrix(no.data),type = "spearman") #spearman correlation object
    View(round(rcorr.S$r,2)) #view correlations, round to 2

    results = as.data.frame(matrix(data = NA, nrow=0, ncol = 1)) #empty matrix for results
    colnames(results) = "R2 adj."
    models = c("S.in.NO ~ IQ+Islam", #base model
               "S.in.NO ~ IQ+Islam+CousinPercent", #1. inbreeding var
               "S.in.NO ~ IQ+Islam+CoefInbreed", #2. inbreeding var
               "S.in.NO ~ Islam+CousinPercent", #without IQ
               "S.in.NO ~ Islam+CoefInbreed", #without IQ
               "S.in.NO ~ IQ+CousinPercent", #without Islam
               "S.in.NO ~ IQ+CoefInbreed") #without Islam
    for (model in models){ #run all the models
      fit.model = lm(as.formula(model), no.data) #fit model
      sum.stats = summary(fit.model) #summary stats object
      print(sum.stats) #summary stats (explicit print is needed inside a loop)
      print(lm.beta(fit.model)) #standardized betas
      results[model,] = sum.stats$adj.r.squared #add result to results object
    }
    View(results) #view results