**Abstract**

A reanalysis of Carl (2015) revealed that the inclusion of London had a strong effect on the S loadings of the crime and poverty variables. S factor scores from a dataset without London and without redundant variables were strongly related to IQ scores, r = .87. The Jensen coefficient for this relationship was .86.

**Introduction**

Carl (2015) analyzed socioeconomic inequality across 12 regions of the UK. While reading his paper, I thought of several analyses that Carl had not done. I therefore asked him for the data, and he shared it with me. For a fuller description of the data sources, see his article.

**Redundant variables and London**

Including (nearly) perfectly correlated variables can skew an extracted factor. For this reason, I created an alternative dataset from which one variable of each pair with an absolute correlation above .90 was removed. The following pairs of strongly correlated variables were found:

- median.weekly.earnings and log.weekly.earnings r=0.999
- GVA.per.capita and log.GVA.per.capita r=0.997
- R.D.workers.per.capita and log.weekly.earnings r=0.955
- log.GVA.per.capita and log.weekly.earnings r=0.925
- economic.inactivity and children.workless.households r=0.914

In each case, the first of the pair was removed from the dataset. However, this resulted in a dataset with 11 cases and 11 variables, which cannot be factor analyzed. For this reason, I kept both variables of the last pair.
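The screening step described above can be illustrated with a short sketch. The original analysis was done in R; this Python version is only an illustration, and the function names, the `keep` argument, and the toy correlation matrix are hypothetical rather than taken from Carl's data.

```python
import numpy as np

def find_redundant_pairs(corr, names, threshold=0.90):
    """List variable pairs whose absolute correlation exceeds the threshold."""
    corr = np.asarray(corr, dtype=float)
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(corr[i, j]) > threshold:
                pairs.append((names[i], names[j], float(corr[i, j])))
    # Strongest pairs first, judged by absolute correlation.
    return sorted(pairs, key=lambda p: -abs(p[2]))

def drop_first_of_each_pair(names, pairs, keep=()):
    """Drop the first variable of every pair, except variables listed in keep."""
    to_drop = {first for first, _, _ in pairs if first not in keep}
    return [n for n in names if n not in to_drop]

# Toy data (hypothetical names and values, for illustration only).
names = ["earnings", "log_earnings", "crime", "poverty"]
corr = np.array([
    [1.00, 0.99, 0.30, -0.40],
    [0.99, 1.00, 0.28, -0.42],
    [0.30, 0.28, 1.00, -0.93],
    [-0.40, -0.42, -0.93, 1.00],
])

pairs = find_redundant_pairs(corr, names)
reduced = drop_first_of_each_pair(names, pairs)
reduced_keeping_crime = drop_first_of_each_pair(names, pairs, keep=("crime",))
print(pairs)
print(reduced)
print(reduced_keeping_crime)
```

The `keep` argument mirrors the decision to retain a pair when dropping it would leave too few variables relative to cases.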

Furthermore, because capitals are known to sometimes strongly affect results (Kirkegaard, 2015a, 2015b, 2015d), I also created two further datasets without London: one with the redundant variables, one without. Thus, there were 4 datasets:

- A dataset with London and redundant variables.
- A dataset with redundant variables but without London.
- A dataset with London but without redundant variables.
- A dataset without London and without redundant variables.

**Factor analysis**

Each of the four datasets was factor analyzed. Figure 1 shows the loadings.

*Figure 1: S factor loadings in four analyses.*

Removing London strongly affected the loading of the crime variable, which changed from moderately positive to moderately negative. The poverty variable also saw a large change, from slightly negative to strongly negative. Both changes are in the direction towards a purer S factor (desirable outcomes with positive loadings, undesirable outcomes with negative loadings). Removing the redundant variables did not have much effect.

As a check, I investigated whether these results were stable across 30 different factor analytic methods.^{1} They were: all loadings and scores correlated near 1.00. For the remaining analyses, I used those extracted with the combination of minimum residuals extraction and regression scoring.
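A stability check of this kind can be sketched as follows. This is an illustrative Python sketch, not the R code used for the analysis. The extraction and scoring names are assumptions based on my understanding of the options of `fa()` in the psych package, and the loading vectors are hypothetical stand-ins for the output of the different methods.

```python
from itertools import combinations, product

import numpy as np

# Assumed option names for extraction and scoring (not verified against the source).
EXTRACTION = ["minres", "wls", "gls", "pa", "ml", "minchi"]
SCORING = ["regression", "Thurstone", "tenBerge", "Anderson", "Bartlett"]

# Every extraction method paired with every scoring method.
method_grid = list(product(EXTRACTION, SCORING))

def min_pairwise_abs_corr(vectors):
    """Smallest absolute correlation between any two of the given vectors.

    Absolute values are used because the sign of a factor is arbitrary.
    """
    return min(abs(np.corrcoef(a, b)[0, 1]) for a, b in combinations(vectors, 2))

# Hypothetical loadings from three methods: a baseline, a slightly perturbed
# version, and a sign-flipped version.
base = np.array([0.9, 0.8, -0.7, 0.5, -0.3])
toy_loadings = [base, base + 0.01 * np.array([1, -1, 1, -1, 1]), -base]

stability = min_pairwise_abs_corr(toy_loadings)
print(len(method_grid), round(stability, 3))
```

A minimum close to 1.00 across all pairs indicates that the choice of method makes little practical difference.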

**Mixedness**

Due to London’s strong effect on the loadings, one should check that the two methods developed for finding such cases can identify it (Kirkegaard, 2015c). Figure 2 shows the results from these two methods (mean absolute residual and change in factor size):

*Figure 2: Mixedness metrics for the complete dataset.*

As can be seen, London was identified as a far outlier using both methods.
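As a rough illustration of the two metrics, the sketch below computes a mean absolute residual and a change in factor size for each case. It rests on several assumptions: the definitions follow my reading of the method names, the first principal component stands in for the extracted factor (a swapped-in simplification, not the minimum residuals extraction used here), and the data are simulated with one planted mixed case.

```python
import numpy as np

def zscore(X):
    """Standardize each column to mean 0 and standard deviation 1."""
    return (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

def first_component(Z):
    """Scores and variance share of the first principal component of Z."""
    corr = np.corrcoef(Z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)  # eigenvalues in ascending order
    return Z @ eigvecs[:, -1], eigvals[-1] / Z.shape[1]

def mean_abs_residual(X):
    """Per case: mean absolute residual after predicting each variable from the scores."""
    Z = zscore(X)
    scores, _ = first_component(Z)
    slopes = (Z.T @ scores) / (scores @ scores)  # one slope per variable
    residuals = Z - np.outer(scores, slopes)
    return np.abs(residuals).mean(axis=1)

def change_in_factor_size(X):
    """Per case: variance share of the component without the case minus the share with it."""
    _, full_share = first_component(zscore(X))
    changes = []
    for i in range(X.shape[0]):
        _, share = first_component(zscore(np.delete(X, i, axis=0)))
        changes.append(share - full_share)
    return np.array(changes)

# Simulated data: 12 cases, 6 variables, one common factor (all hypothetical).
rng = np.random.default_rng(0)
latent = np.linspace(-1.5, 1.5, 12)
X = np.outer(latent, np.full(6, 0.9)) + 0.1 * rng.standard_normal((12, 6))
X[5] += 1.5 * np.array([1, -1, 1, -1, 1, -1])  # case 5 violates the factor structure

mar = mean_abs_residual(X)
cfs = change_in_factor_size(X)
print(int(np.argmax(mar)), int(np.argmax(cfs)))
```

In this simulated example the planted case stands out on both metrics, which is the pattern reported for London in Figure 2.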

**S scores and IQ**

Carl’s dataset also contains IQ scores for the regions. These correlate .87 with the S factor scores from the dataset without London and without redundant variables. Figure 3 shows the scatter plot.

*Figure 3: Scatter plot of S and IQ scores for regions of the UK.*

However, it is possible that IQ is not really related to the latent S factor, but only to the remaining, non-S variance in the extracted S scores. To test this, I used Jensen’s method (method of correlated vectors) (Jensen, 1998). Figure 4 shows the results.

*Figure 4: Jensen’s method for the S factor’s relationship to IQ scores.*

Jensen’s method thus supported the claim that IQ scores and the latent S factor are related: the Jensen coefficient was .86.
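The logic of the method of correlated vectors can be sketched in a few lines: correlate each indicator with the criterion (here IQ), then correlate that vector with the vector of factor loadings. The following Python sketch is illustrative only. The function name, the loadings, and the simulated data are hypothetical, and a large simulated sample is used for stability, unlike the small number of regions analyzed here.

```python
import numpy as np

def jensen_coefficient(X, loadings, criterion):
    """Correlation between factor loadings and indicator-criterion correlations."""
    X = np.asarray(X, dtype=float)
    indicator_r = np.array(
        [np.corrcoef(X[:, j], criterion)[0, 1] for j in range(X.shape[1])]
    )
    return float(np.corrcoef(loadings, indicator_r)[0, 1]), indicator_r

# Simulated data (all values hypothetical): five indicators of one latent factor.
rng = np.random.default_rng(1)
n = 2000
loadings = np.array([0.9, 0.7, 0.5, -0.3, -0.6])
latent = rng.standard_normal(n)
noise = rng.standard_normal((n, 5))
X = latent[:, None] * loadings + noise * np.sqrt(1 - loadings**2)

# The criterion is related to the indicators only through the latent factor.
criterion = 0.8 * latent + 0.6 * rng.standard_normal(n)

coef, indicator_r = jensen_coefficient(X, loadings, criterion)
print(round(coef, 2))
```

When the criterion is related to the indicators only through the latent factor, indicators with stronger loadings correlate more strongly with the criterion, so the coefficient is high. In a real analysis, the loadings would come from the factor analysis rather than being known in advance.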

**Discussion and conclusion**

My reanalysis revealed some interesting results regarding the effect of London on the loadings. It was made possible by data sharing, which demonstrates the importance of this practice (Wicherts & Bakker, 2012).

**Supplementary material**

R source code and datasets are available at the OSF.

**References**

Carl, N. (2015). IQ and socioeconomic development across regions of the UK. *Journal of Biosocial Science*, 1–12. http://doi.org/10.1017/S002193201500019X

Jensen, A. R. (1998). *The g factor: the science of mental ability*. Westport, Conn.: Praeger.

Kirkegaard, E. O. W. (2015a). Examining the S factor in Mexican states. *The Winnower*. Retrieved from https://thewinnower.com/papers/examining-the-s-factor-in-mexican-states

Kirkegaard, E. O. W. (2015b). Examining the S factor in US states. *The Winnower*. Retrieved from https://thewinnower.com/papers/examining-the-s-factor-in-us-states

Kirkegaard, E. O. W. (2015c). Finding mixed cases in exploratory factor analysis. *The Winnower*. Retrieved from https://thewinnower.com/papers/finding-mixed-cases-in-exploratory-factor-analysis

Kirkegaard, E. O. W. (2015d). The S factor in Brazilian states. *The Winnower*. Retrieved from https://thewinnower.com/papers/the-s-factor-in-brazilian-states

Revelle, W. (2015). psych: Procedures for Psychological, Psychometric, and Personality Research (Version 1.5.4). Retrieved from http://cran.r-project.org/web/packages/psych/index.html

Wicherts, J. M., & Bakker, M. (2012). Publish (your data) or (let the data) perish! Why not publish your data too? *Intelligence*, *40*(2), 73–76. http://doi.org/10.1016/j.intell.2012.01.004

^{1} There are 6 different extraction methods and 5 scoring methods supported by the fa() function from the psych package (Revelle, 2015). Thus, there are 6 × 5 = 30 combinations.