{"id":3241,"date":"2012-09-17T13:19:39","date_gmt":"2012-09-17T12:19:39","guid":{"rendered":"http:\/\/emilkirkegaard.dk\/en\/?p=3241"},"modified":"2012-09-17T13:21:18","modified_gmt":"2012-09-17T12:21:18","slug":"mining-data-from-okcupid-using-okcs-questions-as-an-iq-test","status":"publish","type":"post","link":"https:\/\/emilkirkegaard.dk\/en\/2012\/09\/mining-data-from-okcupid-using-okcs-questions-as-an-iq-test\/","title":{"rendered":"Mining data from OKCupid: using OKC\u2019s questions as an IQ-test"},"content":{"rendered":"<p>Is it possible? Yes. In the exact same way that one normally makes an IQ-test, one can do so with OKC.<\/p>\n<ol>\n<li>Handpick a lot of questions that have to do with intelligence.<\/li>\n<li>Mine a huge data set with those questions and people\u2019s answers to them.<\/li>\n<li>Do a factor analysis.<\/li>\n<li>The g-factor should turn up. If it doesn\u2019t, step (1) failed.<\/li>\n<li>Calculate how each question+answer set correlates with g, that is, how g-loaded it is.<\/li>\n<li>Use the questions+answers to infer people\u2019s IQs.<\/li>\n<\/ol>\n<p>Then, after that has been done:<\/p>\n<ul>\n<li>Mine some more data from people\u2019s profiles, and look for correlations with IQ.<\/li>\n<li>Do some stats.<\/li>\n<li>Share the data.<\/li>\n<\/ul>\n<p>Technically feasible?<br \/>\nYes, but the OKC staff might block it if one is not careful. One has to be logged in to search, but one does not need to log in to see the profiles. If one views some 10,000 profiles in a day while logged in, they might get suspicious and close the profile. One can get around this by doing the search while logged in, and then mining the information without being logged in (perhaps from multiple different IP addresses).<\/p>\n<p>Difficulty in finding questions+answers? Sort of. 
One should crawl each user\u2019s questions page, and use one of the first four sorting options &#8211; <a href=\"http:\/\/www.okcupid.com\/profile\/Filomath\/questions?interest=1\">Sorted by magic<\/a>, He cares about, Answered recently, Answered in ancient times &#8211; as these include all the questions that a given user has answered publicly.<\/p>\n<p>Ethical, legal?<br \/>\nYes and yes. Legally, people have answered these questions publicly, thus knowing that others can read them. That\u2019s the whole idea of answering them publicly as opposed to privately. In both cases they are used for OKC\u2019s matching algorithms.<\/p>\n<p>I don\u2019t see any ethical problems in doing this, although I wouldn\u2019t try to popularize picture+IQ score combinations.<\/p>\n<p>Endless study possibilities<br \/>\nOKC has a wealth of information about people, not just intelligence, which can be inferred. It is truly a gold mine for psychological research. I note that it is also useful for my ongoing study: <a href=\"http:\/\/emilkirkegaard.dk\/intelligence_from_picture_study\">http:\/\/emilkirkegaard.dk\/intelligence_from_picture_study<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Is it possible? Yes. In the exact same way that one normally makes an IQ-test, one can do so with OKC. Handpick a lot of questions that have to do with intelligence. Mine a huge data set with those questions and people\u2019s answers to them. Do a factor analysis. The g-factor should turn up. 
If [&hellip;]<\/p>\n","protected":false},"author":17,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1839,1653],"tags":[1928],"class_list":["post-3241","post","type-post","status-publish","format-standard","hentry","category-psychometics","category-psychology","tag-computer-science","entry"],"_links":{"self":[{"href":"https:\/\/emilkirkegaard.dk\/en\/wp-json\/wp\/v2\/posts\/3241","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/emilkirkegaard.dk\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/emilkirkegaard.dk\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/emilkirkegaard.dk\/en\/wp-json\/wp\/v2\/users\/17"}],"replies":[{"embeddable":true,"href":"https:\/\/emilkirkegaard.dk\/en\/wp-json\/wp\/v2\/comments?post=3241"}],"version-history":[{"count":2,"href":"https:\/\/emilkirkegaard.dk\/en\/wp-json\/wp\/v2\/posts\/3241\/revisions"}],"predecessor-version":[{"id":3243,"href":"https:\/\/emilkirkegaard.dk\/en\/wp-json\/wp\/v2\/posts\/3241\/revisions\/3243"}],"wp:attachment":[{"href":"https:\/\/emilkirkegaard.dk\/en\/wp-json\/wp\/v2\/media?parent=3241"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/emilkirkegaard.dk\/en\/wp-json\/wp\/v2\/categories?post=3241"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/emilkirkegaard.dk\/en\/wp-json\/wp\/v2\/tags?post=3241"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}