The g factor and principal components regression

Jensen spent decades trying to convince people that the g factor (general intelligence) is the primary reason why IQ tests predict stuff, rather than whatever specific mental abilities the tests appear to measure or what their makers wanted to measure. For example:

As for the tests themselves, and for many of the real-life tasks and demands on which performance is to some degree predictable from the most g-loaded tests, it appears generally that g is associated with the relative degree of complexity of the tests’ or tasks’ cognitive demands. It is well known that test batteries that measure IQ are good predictors of educational achievement and occupational level (Jensen, 1993a). Perhaps less well-known is the fact that g is the chief “active ingredient” in this predictive validity more than any of the specific knowledge and skills content of the tests. If g were statistically removed from IQ and scholastic aptitude tests, they would have no practically useful predictive validity. This is not to say that certain group factors (e.g., verbal, numerical, spatial, and memory) in these tests do not enhance the predictive validity, but their effect is relatively small compared to g.

It’s funny because if we go to another field, we find that the generalized version of this finding is not controversial at all, but a common assumption!

The principal components regression (PCR) approach involves constructing the first $M$ principal components, $Z_1, \ldots, Z_M$, and then using these components as the predictors in a linear regression model that is fit using least squares. The key idea is that often a small number of principal components suffice to explain most of the variability in the data, as well as the relationship with the response. In other words, we assume that the directions in which $X_1, \ldots, X_p$ show the most variation are the directions that are associated with $Y$. While this assumption is not guaranteed to be true, it often turns out to be a reasonable enough approximation to give good results.

If the assumption underlying PCR holds, then fitting a least squares model to $Z_1, \ldots, Z_M$ will lead to better results than fitting a least squares model to $X_1, \ldots, X_p$, since most or all of the information in the data that relates to the response is contained in $Z_1, \ldots, Z_M$, and by estimating only $M \ll p$ coefficients we can mitigate overfitting.
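To make the quoted description concrete, here is a minimal sketch of PCR in Python with NumPy, run on simulated data. Everything specific in it is an assumption for illustration: the function names (`pcr_fit`, `pcr_predict`, `r2`), the one-factor simulation, the loadings of 0.7, and the sample sizes are made up and do not come from Jensen or from the textbook passage. It only shows the mechanics: project the predictors onto the first M principal directions, then run ordinary least squares on those scores.

```python
import numpy as np

def pcr_fit(X, y, M):
    """Fit principal components regression using the first M components."""
    mu = X.mean(axis=0)
    Xc = X - mu
    # Rows of Vt are the principal directions of the centered predictors.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:M].T                          # p x M loadings
    Z = Xc @ V                            # n x M component scores
    design = np.column_stack([np.ones(len(Z)), Z])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return mu, V, coef

def pcr_predict(model, X):
    """Predict from a model returned by pcr_fit."""
    mu, V, coef = model
    return coef[0] + (X - mu) @ V @ coef[1:]

def r2(y, yhat):
    """Proportion of variance in y accounted for by yhat."""
    return 1 - np.sum((y - yhat) ** 2) / np.sum((y - y.mean()) ** 2)

# Simulated toy data (an assumption, not real test data): 20 "tests" that all
# load 0.7 on a single general factor g, and an outcome that depends only on g.
rng = np.random.default_rng(0)
n, p = 400, 20
g = rng.normal(size=n)
X = 0.7 * g[:, None] + np.sqrt(0.51) * rng.normal(size=(n, p))
y = 0.6 * g + 0.8 * rng.normal(size=n)

train, test = slice(0, 200), slice(200, None)

# M = 1 keeps only the first component; M = p is the same as ordinary least
# squares on all p predictors.
r2_pcr1 = r2(y[test], pcr_predict(pcr_fit(X[train], y[train], 1), X[test]))
r2_full = r2(y[test], pcr_predict(pcr_fit(X[train], y[train], p), X[test]))
print("held-out R^2, first component only:", round(r2_pcr1, 3))
print("held-out R^2, all predictors:      ", round(r2_full, 3))
```

In a simulation built like this, the first component is close to an equally weighted sum of the tests, which is about as close to a g score as one can get from these data. Comparing the two held-out values shows how much (or how little) is lost by discarding the other 19 components.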

I wasn’t able to find anyone who had noticed this connection before, but it’s somewhat remarkable in hindsight. The active-ingredient status of g in cognitive data is just a specific case in which the assumption underlying principal components regression holds. One can also quantify how well this assumption holds in a given domain by looking at how much of the validity is concentrated in the first few components (similar to this idea in genomics).

My hypothesis a few years ago was that for personality, validity is quite distributed, so that a few latent variables will not predict as well as the entire set of personality variables (items). Functionally, this is basically the opposite of what the general factor of personality (GFP) people are saying. Revelle and colleagues have a recent paper showing this to be true. In a project underway, I have shown that cognitive low-level data (items with response-level data) also contain a lot more validity (5 to 40% more; in the personality study, it was >100% more for items than for OCEAN), but not as much extra validity as the personality data. Thus, as usual, we find that Jensen was approximately correct: cognitive data is mostly predictive because of the g factor.
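One rough way to put a number on "how much of the validity is concentrated in the first components" is sketched below. Because principal component scores are uncorrelated with each other, the in-sample R² from regressing an outcome on all predictors splits exactly into the squared correlations between the outcome and each component, so the first component's share can be read off directly. The function name `validity_by_component` and both simulated outcomes are hypothetical and invented for illustration; they are not the data from the studies mentioned above, and a real analysis would want held-out rather than in-sample estimates.

```python
import numpy as np

def validity_by_component(X, y):
    """Squared correlation of y with each principal component score.

    The scores are mutually uncorrelated, so the returned values sum to the
    in-sample R^2 of regressing y on all columns of X (assuming X has full
    column rank and more rows than columns).
    """
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    # Columns of U are the component scores rescaled to unit length.
    U, _, _ = np.linalg.svd(Xc, full_matrices=False)
    r = U.T @ yc / np.linalg.norm(yc)
    return r ** 2

# Simulated toy predictors (an assumption): 20 items loading 0.7 on one factor.
rng = np.random.default_rng(1)
n, p = 1000, 20
f = rng.normal(size=n)
X = 0.7 * f[:, None] + np.sqrt(0.51) * rng.normal(size=(n, p))

# Outcome A: driven only by the general factor (validity concentrated).
y_general = 0.6 * f + 0.8 * rng.normal(size=n)

# Outcome B: driven by item-specific variance. The weights sum to zero, so the
# general factor cancels out of X @ beta (validity distributed).
beta = np.tile([1.0, -1.0], p // 2)
y_spread = X @ beta + 3.0 * rng.normal(size=n)

v_general = validity_by_component(X, y_general)
v_spread = validity_by_component(X, y_spread)
share_general = v_general[0] / v_general.sum()
share_spread = v_spread[0] / v_spread.sum()
print("share of R^2 in first component, outcome A:", round(share_general, 3))
print("share of R^2 in first component, outcome B:", round(share_spread, 3))
```

The same predictors are used for both outcomes on purpose: whether the first component carries the validity depends on the outcome as well as on the factor structure of the predictors.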
