Mining data from OKCupid: using OKC’s questions as an IQ-test

Is it possible? Yes. One can build a test out of OKC's questions in exactly the same way that one normally constructs an IQ test.

  1. Handpick a lot of questions that have to do with intelligence.
  2. Mine a huge data set with those questions and people’s answers to them.
  3. Do a factor analysis.
  4. The g-factor should turn up. If it doesn't, step 1 failed.
  5. Calculate how each question+answer set correlates with g, that is, how g-loaded it is.
  6. Use the questions+answers to infer people’s IQs.
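
Steps 3 to 6 can be sketched roughly as follows. This is only a minimal sketch on simulated yes/no answers, not on mined OKC data, and it uses the first principal component of the item correlation matrix as a simple stand-in for a proper factor analysis. All names (answers, loadings, iq) and all simulation parameters are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated stand-in for mined data: 2000 users x 12 yes/no questions.
# In real data, true_g and true_loading would be unknown.
n_users, n_items = 2000, 12
true_g = rng.standard_normal(n_users)
true_loading = rng.uniform(0.3, 0.8, n_items)
noise = rng.standard_normal((n_users, n_items))
latent = true_g[:, None] * true_loading + noise * np.sqrt(1 - true_loading**2)
answers = (latent > 0).astype(float)  # 1 = keyed answer, 0 = other answer

# Step 3: extract the first principal component of the item correlations
# (an approximation to the general factor, not a full factor analysis).
corr = np.corrcoef(answers, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(corr)  # eigenvalues in ascending order
first = eigvecs[:, -1]
if first.sum() < 0:  # the sign of an eigenvector is arbitrary; make it positive
    first = -first

# Step 4: check that a general factor turned up, i.e. that the first
# component explains clearly more variance than a single item would.
share = eigvals[-1] / eigvals.sum()

# Step 5: loading of each question on the first component.
loadings = first * np.sqrt(eigvals[-1])

# Step 6: loading-weighted score per user, rescaled to mean 100, SD 15.
z = (answers - answers.mean(axis=0)) / answers.std(axis=0)
raw = z @ loadings
iq = 100 + 15 * (raw - raw.mean()) / raw.std()

print("variance share of first component:", round(share, 3))
print("loadings:", np.round(loadings, 2))
print("correlation of estimate with simulated g:",
      round(np.corrcoef(iq, true_g)[0, 1], 3))
```

With real data, the answers matrix would come from the mined questions, and users who skipped questions would have to be handled before computing correlations. The rescaling also norms the scores on the sample itself, so the numbers are only comparable within that sample.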

Then, after that has been done:

  • Mine some more data from people’s profiles, and look for correlations with IQ.
  • Do some stats.
  • Share the data.
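
As one example of "some stats", the correlation between inferred IQ and a numeric profile variable could be computed with a confidence interval. This is a sketch on simulated numbers. The variable profile_var is hypothetical and does not refer to any actual OKC profile field, and the interval uses the standard Fisher z approximation.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated stand-ins: inferred IQ and one hypothetical numeric profile variable.
n = 500
iq = rng.normal(100, 15, n)
profile_var = 0.02 * iq + rng.normal(0, 1, n)

# Pearson correlation between the two variables.
r = np.corrcoef(iq, profile_var)[0, 1]

# Approximate 95% confidence interval via the Fisher z-transformation.
z = np.arctanh(r)
se = 1 / np.sqrt(n - 3)
ci_low, ci_high = np.tanh(z - 1.96 * se), np.tanh(z + 1.96 * se)

print(f"r = {r:.3f}, 95% CI [{ci_low:.3f}, {ci_high:.3f}]")
```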

Technically feasible?
Yes, but the OKC staff might block it if one is not careful. One has to be logged in to search, but not to view profiles. If one views some 10,000 profiles in a day while logged in, the staff might get suspicious and close the account. One can get around this by doing the search while logged in, and then mining the information without being logged in (perhaps from multiple different IP addresses).

Difficulty in finding questions+answers?
Sort of. One should crawl each user's questions page and use one of the first four sorting options (Sorted by magic, He cares about, Answered recently, Answered in ancient times), as these include all the questions that a given user has answered publicly.

Ethical, legal?
Yes and yes. Legally, people have answered these questions publicly, knowing that others can read them. That's the whole idea of answering them publicly as opposed to privately. In both cases the answers are used for OKC's matching algorithms.

I don't see any ethical problems in doing this, although I wouldn't try to popularize picture+IQ score combinations.

Endless study possibilities
OKC has a wealth of information about people, not just the intelligence that can be inferred from their answers. It is truly a gold mine for psychological research. I note that it is also useful for my ongoing study:

