Predicting immigrant performance: Does inbreeding have incremental validity over IQ and Islam?

@hbdchick if u can produce an inbreeding measure by country, we can test it. or u can, the data are public. :)

— Emil O W Kirkegaard (@KirkegaardEmil) January 14, 2015

So, she came up with:

@KirkegaardEmil good (weighted) coefficients of inbreeding by country at the back of this paper: http://t.co/FjBZCHufqi but, again…

— hbd chick 👀 (@hbdchick) January 14, 2015

So I decided to try it out, since I was taking a break from reading Lilienfeld, which I had been doing for 5 hours straight or so.

So the question is whether inbreeding measures have incremental validity over IQ and Islam, which I have previously used to examine immigrant performance in a number of studies.

To get the data into R, I OCR’d the PDF in Abbyy FineReader, since this program allows for easy copying of table data by row or column. I only wanted columns 1-2 and didn’t want to deal with the hassle of importing them via a spreadsheet program (which needs a consistent separator, e.g. comma or space). Then I merged the data with the megadataset (version 2.0d) to create a new version, 2.0e.
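As a rough sketch of the import step: tab-separated text copied out of an OCR program can be read straight into a data frame without going through a spreadsheet. The country names and numbers below are made-up placeholders, not values from the paper, and `ocr_text` and `inbreed_toy` are names invented for this illustration.

```r
# Placeholder rows standing in for the OCR'd columns (values are made up).
ocr_text <- "Country\tCousinPercent\tCoefInbreed
CountryA\t10.5\t0.0060
CountryB\t1.2\t0.0005
CountryC\t35.0\t0.0200"

# sep = "\t" keeps multi-word country names intact; row.names = 1 uses the
# first column (the country) as row names, as in the script further down.
inbreed_toy <- read.table(text = ocr_text, sep = "\t", header = TRUE,
                          row.names = 1, stringsAsFactors = FALSE)
str(inbreed_toy)
```

Reading from `"clipboard"` instead of `text =` works the same way on Windows, which is presumably why it is used in the actual script.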

Then I created a subset of the data with the variables of interest and renamed them (otherwise the results would be unwieldy). Pearson intercorrelations are:

	row.names	Cousin%	CoefInbreed	IQ	Islam	S.in.DK
1	Cousin%	1.00	0.52	-0.59	0.78	-0.76
2	CoefInbreed	0.52	1.00	-0.28	0.40	-0.55
3	IQ	-0.59	-0.28	1.00	-0.27	0.54
4	Islam	0.78	0.40	-0.27	1.00	-0.71
5	S.in.DK	-0.76	-0.55	0.54	-0.71	1.00

Spearman correlations, which are probably more appropriate given the non-normal data:

	row.names	Cousin%	CoefInbreed	IQ	Islam	S.in.DK
1	Cousin%	1.00	0.91	-0.63	0.67	-0.73
2	CoefInbreed	0.91	1.00	-0.55	0.61	-0.76
3	IQ	-0.63	-0.55	1.00	-0.23	0.72
4	Islam	0.67	0.61	-0.23	1.00	-0.61
5	S.in.DK	-0.73	-0.76	0.72	-0.61	1.00
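To illustrate why the two coefficients can differ, here is a minimal sketch on synthetic, skewed data (not the megadataset). It uses base R's `cor` rather than `rcorr`; the variable names and the data-generating process are assumptions chosen only to produce a monotone but non-linear relation.

```r
# Synthetic, skewed data for illustration only.
set.seed(1)
n <- 50
x <- rexp(n)                          # right-skewed predictor
y <- -x^2 + rnorm(n, sd = 0.5)        # monotone, non-linear relation plus noise
d <- data.frame(x = x, y = y)
d$y[3] <- NA                          # country-level data often have gaps

# Pairwise deletion keeps as many cases as possible for each pair.
pearson  <- cor(d, method = "pearson",  use = "pairwise.complete.obs")
spearman <- cor(d, method = "spearman", use = "pairwise.complete.obs")

print(round(pearson, 2))
print(round(spearman, 2))
```

Spearman works on ranks, so it is less sensitive to skew and outliers, which is the reason for preferring it above.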

The fairly high correlations of the inbreeding measures with IQ and Islam mean that their incremental validity will likely be modest.

However, let’s try modeling them. I created 7 models of interest and compiled the primary measure of interest, adjusted R², into an object. It looks like this:

	row.names	R2 adj.
1	S.in.DK ~ IQ+Islam	0.5472850
2	S.in.DK ~ IQ+Islam+CousinPercent	0.6701305
3	S.in.DK ~ IQ+Islam+CoefInbreed	0.7489312
4	S.in.DK ~ Islam+CousinPercent	0.6776841
5	S.in.DK ~ Islam+CoefInbreed	0.7438711
6	S.in.DK ~ IQ+CousinPercent	0.5486674
7	S.in.DK ~ IQ+CoefInbreed	0.4979552

So we see that either of them adds a fair amount of incremental validity to the base model (line 1 vs. 2-3). They in fact do better than IQ if one substitutes them in for it (1 vs. 4-5). They can also substitute for Islam, with about the same predictive power for CousinPercent and somewhat less for CoefInbreed (1 vs. 6-7).
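For readers who want to see what "incremental validity" amounts to computationally, here is a minimal sketch on synthetic data. The coefficients in the data-generating process are arbitrary assumptions, and the `anova` F-test for nested models is an addition for illustration; the analysis in this post only compares adjusted R².

```r
# Synthetic data: the added predictor is correlated with both base predictors.
set.seed(2)
n <- 60
iq      <- rnorm(n)
islam   <- rnorm(n)
inbreed <- 0.6 * islam - 0.4 * iq + rnorm(n, sd = 0.7)
s       <- 0.5 * iq - 0.4 * islam - 0.4 * inbreed + rnorm(n, sd = 0.6)
toy <- data.frame(S = s, IQ = iq, Islam = islam, Inbreed = inbreed)

base <- lm(S ~ IQ + Islam, data = toy)            # base model
full <- lm(S ~ IQ + Islam + Inbreed, data = toy)  # base model + new predictor

# Incremental validity as the gain in adjusted R^2 ...
delta_adj_r2 <- summary(full)$adj.r.squared - summary(base)$adj.r.squared
# ... and, optionally, an F-test of the nested models.
f_test <- anova(base, full)

print(delta_adj_r2)
print(f_test)
```

Adjusted R² penalizes the extra predictor, so a positive difference means the added variable explains more than one would expect from merely adding a parameter.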

Replication for Norway

Replication is important for science. Let’s try the Norwegian data. The Finnish and Dutch data are not well-suited for this (too few immigrant groups and few outcome variables, i.e. only crime).

Pearson intercorrelations:

	row.names	CousinPercent	CoefInbreed	IQ	Islam	S.in.NO
1	CousinPercent	1.00	0.52	-0.59	0.78	-0.78
2	CoefInbreed	0.52	1.00	-0.28	0.40	-0.46
3	IQ	-0.59	-0.28	1.00	-0.27	0.60
4	Islam	0.78	0.40	-0.27	1.00	-0.72
5	S.in.NO	-0.78	-0.46	0.60	-0.72	1.00

Spearman:

	row.names	CousinPercent	CoefInbreed	IQ	Islam	S.in.NO
1	CousinPercent	1.00	0.91	-0.63	0.67	-0.77
2	CoefInbreed	0.91	1.00	-0.55	0.61	-0.71
3	IQ	-0.63	-0.55	1.00	-0.23	0.75
4	Islam	0.67	0.61	-0.23	1.00	-0.47
5	S.in.NO	-0.77	-0.71	0.75	-0.47	1.00

These look fairly similar to Denmark.

And the regression results:

	row.names	R2 adj.
1	S.in.NO ~ IQ+Islam	0.5899682
2	S.in.NO ~ IQ+Islam+CousinPercent	0.7053999
3	S.in.NO ~ IQ+Islam+CoefInbreed	0.7077162
4	S.in.NO ~ Islam+CousinPercent	0.6826272
5	S.in.NO ~ Islam+CoefInbreed	0.6222364
6	S.in.NO ~ IQ+CousinPercent	0.6080922
7	S.in.NO ~ IQ+CoefInbreed	0.5460777

Fairly similar too. If added, they have incremental validity (line 1 vs. 2-3). They perform better than IQ if substituted for it, but not by as much as in the Danish data (1 vs. 4-5). They can also substitute for Islam (1 vs. 6-7).

How to interpret?

Since inbreeding does not seem to have any direct influence on the behavior reflected in the S factor, these findings are not so easy to interpret. Inbreeding leads to various health problems and lower g in offspring, the latter of which may have some effect. However, national IQs presumably already reflect the lowered IQ from inbreeding, so there should be no additional effect beyond national IQs. Perhaps inbreeding results in other psychological problems that are relevant.

Another idea is that inbreeding rates reflect non-g psychological traits that are relevant to adapting to life in Denmark. Perhaps they are a useful measure of clannishness, which would show up as hostility towards integration into Danish society (e.g. towards getting an education, or as a lack of sympathy for or antipathy towards ethnic Danes, with resulting higher crime rates against them), which in turn would be reflected in the S factor.

The lack of reasonably well-established causal routes makes me somewhat cautious about how to interpret these findings.

##Code for merging cousin marriage+inbreeding data with megadataset
inbreed = read.table("clipboard", sep="\t",header=TRUE, row.names=1) #load data from clipboard
source("merger.R") #load mega functions
mega20d = read.mega("Megadataset_v2.0d.csv") #load latest megadataset
names = as.abbrev(rownames(inbreed)) #get abbreviated names
rownames(inbreed) = names #set them as rownames

#merge and save
mega20e = merge.datasets(mega20d,inbreed,1) #merge to create v. 2.0e
write.mega(mega20e,"Megadataset_v2.0e.csv") #save it

#select subset of interesting data
dk.data = subset(mega20e, select=c("Weighted.mean.consanguineous.percentage.HobenEtAl2010",
                                   "Weighted.mean.coefficient.of.inbreeding.HobenEtAl2010",
                                   "LV2012estimatedIQ",
                                   "IslamPewResearch2010",
                                   "S.factor.in.Denmark.Kirkegaard2014"))
colnames(dk.data) = c("CousinPercent","CoefInbreed","IQ","Islam","S.in.DK") #shorter var names
library(Hmisc) #for rcorr
cor.P = rcorr(as.matrix(dk.data)) #pearson correlation object (named so it does not mask the rcorr function)
View(round(cor.P$r,2)) #view correlations, round to 2
cor.S = rcorr(as.matrix(dk.data),type = "spearman") #spearman correlation object
View(round(cor.S$r,2)) #view correlations, round to 2

#Multiple regression
library(QuantPsyc) #for beta coef
results = as.data.frame(matrix(data = NA, nrow=0, ncol = 1)) #empty matrix for results
colnames(results) = "R2 adj."
models = c("S.in.DK ~ IQ+Islam", #base model,
           "S.in.DK ~ IQ+Islam+CousinPercent", #1. inbreeding var
           "S.in.DK ~ IQ+Islam+CoefInbreed", #2. inbreeding var
           "S.in.DK ~ Islam+CousinPercent", #without IQ
           "S.in.DK ~ Islam+CoefInbreed", #without IQ
           "S.in.DK ~ IQ+CousinPercent", #without Islam
           "S.in.DK ~ IQ+CoefInbreed") #without Islam

for (model in models){ #run all the models
  fit.model = lm(as.formula(model), dk.data) #fit model, converting the string to a formula
  sum.stats = summary(fit.model) #summary stats object
  print(sum.stats) #summary stats (explicit print is needed inside a loop)
  print(lm.beta(fit.model)) #standardized betas
  results[model,] = sum.stats$adj.r.squared #add result to results object
}
View(results) #view results

##Let's try Norway too
no.data = subset(mega20e, select=c("Weighted.mean.consanguineous.percentage.HobenEtAl2010",
                                   "Weighted.mean.coefficient.of.inbreeding.HobenEtAl2010",
                                   "LV2012estimatedIQ",
                                   "IslamPewResearch2010",
                                   "S.factor.in.Norway.Kirkegaard2014"))

colnames(no.data) = c("CousinPercent","CoefInbreed","IQ","Islam","S.in.NO") #shorter var names
cor.P = rcorr(as.matrix(no.data)) #pearson correlation object (named so it does not mask the rcorr function)
View(round(cor.P$r,2)) #view correlations, round to 2
cor.S = rcorr(as.matrix(no.data),type = "spearman") #spearman correlation object
View(round(cor.S$r,2)) #view correlations, round to 2

results = as.data.frame(matrix(data = NA, nrow=0, ncol = 1)) #empty matrix for results
colnames(results) = "R2 adj."
models = c("S.in.NO ~ IQ+Islam", #base model,
           "S.in.NO ~ IQ+Islam+CousinPercent", #1. inbreeding var
           "S.in.NO ~ IQ+Islam+CoefInbreed", #2. inbreeding var
           "S.in.NO ~ Islam+CousinPercent", #without IQ
           "S.in.NO ~ Islam+CoefInbreed", #without IQ
           "S.in.NO ~ IQ+CousinPercent", #without Islam
           "S.in.NO ~ IQ+CoefInbreed") #without Islam

for (model in models){ #run all the models
  fit.model = lm(as.formula(model), no.data) #fit model, converting the string to a formula
  sum.stats = summary(fit.model) #summary stats object
  print(sum.stats) #summary stats (explicit print is needed inside a loop)
  print(lm.beta(fit.model)) #standardized betas
  results[model,] = sum.stats$adj.r.squared #add result to results object
}
View(results) #view results
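
The Danish and Norwegian blocks above differ only in the outcome variable, so the model loop could be folded into one helper. The following is a sketch of that idea, not the code used for the results in this post; `adj_r2_table` and the toy data are invented for illustration.

```r
# Fit one lm per predictor set and collect adjusted R^2 in a one-column table.
adj_r2_table <- function(data, outcome, predictor_sets) {
  rhs <- vapply(predictor_sets, paste, character(1), collapse = "+")
  formulas <- paste(outcome, "~", rhs)
  r2 <- vapply(formulas, function(f) {
    summary(lm(as.formula(f), data = data))$adj.r.squared
  }, numeric(1))
  data.frame("R2 adj." = r2, row.names = formulas, check.names = FALSE)
}

# Toy data with the same short variable names as above (values are synthetic).
set.seed(3)
toy <- data.frame(IQ = rnorm(40), Islam = rnorm(40), CoefInbreed = rnorm(40))
toy$S.in.DK <- 0.5 * toy$IQ - 0.4 * toy$Islam - 0.3 * toy$CoefInbreed +
  rnorm(40, sd = 0.5)

sets <- list(c("IQ", "Islam"), c("IQ", "Islam", "CoefInbreed"))
res <- adj_r2_table(toy, "S.in.DK", sets)
print(res)
```

With the real data, one would call it once per country, e.g. with `dk.data` and `"S.in.DK"`, then with `no.data` and `"S.in.NO"`.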
