Personal freedom and cognitive ability: OKCupid dataset replication

Noah Carl has been investigating the relationship between cognitive ability and political opinions, going beyond the usual confused 1-axis left-right model. Specifically, he has been looking at the economic freedom and personal freedom axes (à la this test). He has done this in two datasets so far, covering the UK and the US.

OKCupid does not have many questions on economic freedom (though there are some), but it has a lot on personal freedom. I identified 18 in my search, of which 17 had a sufficient sample size.

Questions

The questions are the following:

For each question, the options were reordered so that later options were more in favor of freedom. If necessary, some options were removed (recoded to NA), such as those refusing to say how they would vote (for assisted suicide/q45168). In the case of child limits/q49714, any answer other than “Yes” was coded as more freedom. And so on for other questions.
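The recoding can be sketched roughly as follows. This is a minimal illustration in Python, not the code used for the analysis. The option labels ("Against", "In favor", "I would not vote", "Only in some cases") are hypothetical stand-ins, since the actual answer texts are not reproduced here; only the "Yes" rule for child limits comes from the description above.

```python
def recode_ordered(answer, least_to_most_free, drop=()):
    """Map an answer to a 0-based rank where higher = more pro-freedom.

    Missing answers and answers listed in `drop` become None (NA).
    """
    if answer is None or answer in drop:
        return None
    return least_to_most_free.index(answer)


def recode_child_limits(answer):
    """Child limits: "Yes" = less freedom (0), anything else = more freedom (1)."""
    if answer is None:
        return None
    return 0 if answer == "Yes" else 1


# Hypothetical option labels for the assisted suicide question.
SUICIDE_ORDER = ["Against", "In favor"]
SUICIDE_DROP = {"I would not vote"}

responses = ["In favor", "I would not vote", "Against", None]
coded = [recode_ordered(r, SUICIDE_ORDER, SUICIDE_DROP) for r in responses]
print(coded)  # [1, None, 0, None]
```

Keeping the NA cases as None rather than dropping the person means the other answers from the same person remain usable.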

Correlations

The latent correlations (estimated Pearson correlations if the variables had been measured as normally distributed continuous variables) were:

	CA	flag burning	freedom of religion	prostitution	child limits	child test	gay marriage	cigarettes	smoking bars	cannabis	illegal drugs	illegal drugs2	illegal drugs religious	weapons	motorcycles	seatbelts	mandatory voting	assisted suicide
CA	1.00	0.40	0.12	0.28	0.06	0.14	0.35	0.12	-0.04	0.24	0.20	0.23	0.02	0.05	-0.11	0.12	0.01	0.31
flag burning	0.40	1.00	-0.04	0.43	0.08	0.12	0.43	0.16	-0.03	0.46	0.45	0.44	0.14	0.16	-0.10	0.32	-0.16	0.40
freedom of religion	0.12	-0.04	1.00	0.12	0.19	0.05	0.00	0.04	0.08	-0.06	0.01	0.07	0.10	0.04	0.21	-0.03	-0.06	0.02
prostitution	0.28	0.43	0.12	1.00	0.02	0.01	0.46	0.25	0.25	0.56	0.50	0.49	0.18	-0.12	-0.06	0.36	0.03	0.58
child limits	0.06	0.08	0.19	0.02	1.00	0.59	-0.19	0.14	-0.07	0.06	0.05	-0.13	0.00	-0.06	0.11	0.01	0.12	-0.21
child test	0.14	0.12	0.05	0.01	0.59	1.00	-0.22	0.12	-0.07	-0.05	0.15	-0.05	-0.08	-0.01	0.19	0.09	0.13	-0.24
gay marriage	0.35	0.43	0.00	0.46	-0.19	-0.22	1.00	0.08	-0.14	0.58	0.31	0.47	0.46	0.13	-0.62	-0.06	0.00	0.73
cigarettes	0.12	0.16	0.04	0.25	0.14	0.12	0.08	1.00	0.55	0.41	0.41	0.38	0.40	-0.14	0.11	0.27	0.10	0.32
smoking bars	-0.04	-0.03	0.08	0.25	-0.07	-0.07	-0.14	0.55	1.00	0.13	0.41	0.31	0.09	-0.27	0.55	0.48	0.04	0.16
cannabis	0.24	0.46	-0.06	0.56	0.06	-0.05	0.58	0.41	0.13	1.00	0.78	0.78	0.64	-0.13	-0.50	0.27	0.17	0.68
illegal drugs	0.20	0.45	0.01	0.50	0.05	0.15	0.31	0.41	0.41	0.78	1.00	0.54	0.38	-0.08	-0.04	0.34	0.05	0.47
illegal drugs2	0.23	0.44	0.07	0.49	-0.13	-0.05	0.47	0.38	0.31	0.78	0.54	1.00	0.49	-0.12	-0.19	0.21	-0.04	0.52
illegal drugs religious	0.02	0.14	0.10	0.18	0.00	-0.08	0.46	0.40	0.09	0.64	0.38	0.49	1.00	-0.19	-0.50	0.14	0.05	0.45
weapons	0.05	0.16	0.04	-0.12	-0.06	-0.01	0.13	-0.14	-0.27	-0.13	-0.08	-0.12	-0.19	1.00	-0.07	-0.39	-0.15	-0.03
motorcycles	-0.11	-0.10	0.21	-0.06	0.11	0.19	-0.62	0.11	0.55	-0.50	-0.04	-0.19	-0.50	-0.07	1.00	0.11	-0.45	-0.50
seatbelts	0.12	0.32	-0.03	0.36	0.01	0.09	-0.06	0.27	0.48	0.27	0.34	0.21	0.14	-0.39	0.11	1.00	0.34	0.28
mandatory voting	0.01	-0.16	-0.06	0.03	0.12	0.13	0.00	0.10	0.04	0.17	0.05	-0.04	0.05	-0.15	-0.45	0.34	1.00	0.18
assisted suicide	0.31	0.40	0.02	0.58	-0.21	-0.24	0.73	0.32	0.16	0.68	0.47	0.52	0.45	-0.03	-0.50	0.28	0.18	1.00

CA = cognitive ability.

The distribution of correlations among the personal freedom measures was this:

So, it leaned towards positive but not overwhelmingly so. People are not that consistent on the personal freedom axis. The negative correlations mostly come from the motorcycle question: apparently many people who want to ban motorcycles also support e.g. gay marriage (r = -.62).

The mean intercorrelations by item were:

Variable	Mean cor with others
flag burning	0.20
freedom of religion	0.05
prostitution	0.25
child limits	0.04
child test	0.05
gay marriage	0.15
cigarettes	0.23
smoking bars	0.15
cannabis	0.30
illegal drugs	0.29
illegal drugs2	0.26
illegal drugs religious	0.17
weapons	-0.09
motorcycles	-0.11
seatbelts	0.17
mandatory voting	0.02
assisted suicide	0.24
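For the curious, the per-item means can be recomputed from the correlation table. The sketch below (Python, for illustration) takes two rows copied from that table and assumes the CA column and the item's own 1.00 are left out of the mean. That assumption reproduces the tabulated values for these two rows.

```python
# Rows copied from the latent correlation table, CA column removed.
# Item order: flag burning, freedom of religion, prostitution, child limits,
# child test, gay marriage, cigarettes, smoking bars, cannabis, illegal drugs,
# illegal drugs2, illegal drugs religious, weapons, motorcycles, seatbelts,
# mandatory voting, assisted suicide.
FLAG_BURNING = [1.00, -0.04, 0.43, 0.08, 0.12, 0.43, 0.16, -0.03, 0.46,
                0.45, 0.44, 0.14, 0.16, -0.10, 0.32, -0.16, 0.40]
WEAPONS = [0.16, 0.04, -0.12, -0.06, -0.01, 0.13, -0.14, -0.27, -0.13,
           -0.08, -0.12, -0.19, 1.00, -0.07, -0.39, -0.15, -0.03]


def mean_cor_with_others(row, self_index):
    """Mean of a correlation row, skipping the item's correlation with itself."""
    others = [r for i, r in enumerate(row) if i != self_index]
    return sum(others) / len(others)


flag_mean = round(mean_cor_with_others(FLAG_BURNING, 0), 2)
weapons_mean = round(mean_cor_with_others(WEAPONS, 12), 2)
print(flag_mean, weapons_mean)  # 0.2 -0.09
```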

Item response theory factor analysis

Altho some (mostly te Nijenhuis and his co-authors, and previously myself) have been analyzing item-level data using classical test theory measures, this approach is inappropriate because Pearson correlations are influenced by the difficulty of the items (the proportion who get them right, or here, the proportion who pick the pro-freedom option).
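To see why difficulty matters, consider two dichotomous items. The largest Pearson (phi) correlation they can have is capped by their pass rates, even if they measure exactly the same trait. The following is a small Python illustration of that ceiling; the pass rates are made-up values, not taken from the dataset.

```python
import math


def phi_max(p1, p2):
    """Largest possible phi between two binary items with pass rates p1 and p2.

    The ceiling is reached when everyone who passes the harder item
    also passes the easier one.
    """
    lo, hi = sorted((p1, p2))
    return math.sqrt(lo * (1 - hi) / (hi * (1 - lo)))


same_difficulty = phi_max(0.5, 0.5)   # equal pass rates: ceiling is 1
mixed_difficulty = phi_max(0.1, 0.5)  # unequal pass rates: ceiling is about 0.33
print(same_difficulty, round(mixed_difficulty, 2))
```

Latent correlations do not have this ceiling, since they estimate the correlation between the assumed underlying continuous variables.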

The proper method is to use item response theory factor analysis (I use the version in the psych package, irt.fa). The item plot was this:

So, we see that many items were not very good measures of the assumed single underlying factor (Y axis = discrimination ≈ factor loading, but not standardized to a max of 1). This is in part because the factor structure is more complex than a single factor. Most items were too far on the left side, meaning that they could not distinguish well between the people on the right side (various sorts of libertarians, presumably). One would have to include more extreme questions, such as perhaps getting rid of FDA approval for drugs (the hardcore libertarians will say that the market solves this too, assuming perfect rationality…).
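The point about item location can be made concrete with a toy calculation. The sketch below uses a logistic two-parameter (2PL) item model, which is an assumption made for illustration and not necessarily the exact parameterization irt.fa reports. The discrimination and location values are hypothetical.

```python
import math


def p_endorse(theta, a, b):
    """2PL probability of the pro-freedom answer at trait level theta.

    a = discrimination, b = location ("difficulty") of the item.
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta, a, b):
    """Fisher information of a 2PL item at theta: a^2 * P * (1 - P)."""
    p = p_endorse(theta, a, b)
    return a * a * p * (1.0 - p)


# Hypothetical items with equal discrimination but different locations.
A = 1.2
info_left_item = item_information(2.0, A, -1.5)   # item located far to the left
info_right_item = item_information(2.0, A, 1.5)   # item located to the right
print(info_left_item < info_right_item)  # True
```

An item is most informative near its own location, so a set of items clustered on the left tells us little about differences among people far to the right.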

Still, if we ignore the factor structure problem and calculate factor scores anyway, they look like this:

And again we see the problem: both ceiling and floor effects. Some people wanted to ban everything (left end) and many wanted to ban nothing (right end). One would have to introduce more questions to remove these effects, if that is even possible. Alternatively, some of the scores may be due to missing data for these persons. Most persons did not have data for all questions, so the scoring function tried to estimate scores the best it could from the available data.

Still, we can correlate the scores with cognitive ability (based on up to 14 items, see the OKCupid dataset release paper), and we get this:

There is indeed a positive correlation of .24 (N≈50k, but many cases are estimated with missing data). Obviously, the distribution is not normal, so Pearson correlation is a somewhat biased measure. Spearman’s correlation gave the same result, however. The strength of the relationship is around what previous studies found (.20 to .30).
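As a reminder of how the two coefficients relate: Spearman's correlation is just the Pearson correlation computed on ranks, which makes it insensitive to the shape of the distributions. Below is a minimal stdlib-only Python sketch with made-up toy numbers (not the OKCupid data), where one extreme value pulls Pearson down but leaves Spearman untouched.

```python
import math


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy data: perfectly monotonic, but with one extreme value in y.
x = [1, 2, 3, 4, 5]
y = [1, 2, 3, 4, 100]
r_pearson = pearson(x, y)
r_spearman = spearman(x, y)
print(r_pearson < r_spearman)  # True
```

When the two agree, as they did here, the non-normality is probably not doing much damage.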

The second plot is a fancier plot which shows the density in each area. Lighter = more persons. So we see that the highest density of points is in the top-right quadrant, which explains the positive correlation.

Jensen’s method

It is possible to use Jensen’s method (method of correlated vectors) using item response theory results, but one must use the discrimination scores (from the item plot) and the latent correlations (from the first table). If one does, one gets this:

And the result is positive as expected. We can also see that there are only two things that smarter people support banning more than less smart people: smoking in bars and motorcycles. The second one sounds pretty odd to me.
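The mechanics of the method are simple: correlate, across items, a vector of how well each item measures the factor with a vector of each item's correlation with cognitive ability. The Python sketch below illustrates this with one important swap. The discrimination values are only shown in the item plot, so it uses each item's mean intercorrelation (from the table above) as a crude stand-in. It therefore shows the procedure, not the actual result.

```python
import math

# Item order follows the correlation table (flag burning ... assisted suicide).
# Latent correlation of each item with cognitive ability (CA row of the table).
CA_COR = [0.40, 0.12, 0.28, 0.06, 0.14, 0.35, 0.12, -0.04, 0.24,
          0.20, 0.23, 0.02, 0.05, -0.11, 0.12, 0.01, 0.31]

# ASSUMPTION: mean intercorrelation used as a stand-in for item discrimination.
MEAN_COR = [0.20, 0.05, 0.25, 0.04, 0.05, 0.15, 0.23, 0.15, 0.30,
            0.29, 0.26, 0.17, -0.09, -0.11, 0.17, 0.02, 0.24]


def pearson(x, y):
    """Pearson correlation of two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Correlated vectors: do items that hang together better also relate more to CA?
r_mcv = pearson(MEAN_COR, CA_COR)
print(round(r_mcv, 2))
```

With only 17 items, any such vector correlation should be read with a lot of caution.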

So, hopefully, someone will write this up as a real paper and submit it to ODP or OQSPS. At least, if I’m a co-author. If not, then submit it to Intelligence or PAID, I guess.
