{"id":4552,"date":"2014-12-24T16:27:05","date_gmt":"2014-12-24T15:27:05","guid":{"rendered":"http:\/\/emilkirkegaard.dk\/en\/?p=4552"},"modified":"2015-03-10T00:57:28","modified_gmt":"2015-03-09T23:57:28","slug":"correlations-and-likert-scales-what-is-the-bias","status":"publish","type":"post","link":"https:\/\/emilkirkegaard.dk\/en\/2014\/12\/correlations-and-likert-scales-what-is-the-bias\/","title":{"rendered":"Correlations and likert scales: What is the bias?"},"content":{"rendered":"<p><a href=\"https:\/\/www.researchgate.net\/post\/How_can_I_correlate_ordinal_variables_attitude_Likert_scale_with_continuous_ratio_data_years_of_experience#549ada9ad11b8b202f8b4586\">A person on ResearchGate asked the following question<\/a>:<\/p>\n<blockquote><p>How can I correlate ordinal variables (attitude Likert scale) with continuous ratio data (years of experience)?<br \/>\nCurrently, I am working on my dissertation which explores learning organisation characteristics at HEIs. One of the predictor demographic variables is the indication of the years of experience. Respondents were asked to fill in the gap the number of years. Should I categorise the responses instead? as for example:<br \/>\n1. from 1 to 4 years<br \/>\n2. from 4 to 10<br \/>\nand so on?<br \/>\nor is there a better choice\/analysis I could apply?<\/p><\/blockquote>\n<p>My answer may also be of interest to others, so I post it here as well.<\/p>\n<div id=\"yui_3_14_1_1_1419433087888_3064\" class=\"js-content\">\n<div id=\"rg-injektor-generated-rg_modules_publictopics_actions_PostCommentItemContentProxy_399c3df52a6ef10de188c0999121d336\" class=\"post-edit-content comment-body topic-post-text rich-text-styled js-widgetContainer\">\n<p id=\"yui_3_14_1_1_1419433087888_3063\">Normal practice is to treat Likert scales as continuous variables even though they are not. As long as there are &gt;=5 response options, the bias from discreteness is not large.<\/p>\n<p>I simulated the situation for you. 
I generated two continuous, normally distributed variables with a correlation of .50, N=1000. Then I created Likert scales with varying numbers of levels by cutting the second variable into bins of equal width. Then I correlated all these variables with each other.<\/p>\n<p>Correlations of continuous variable 1 with:<\/p>\n<p>continuous2 0.5<br \/>\nlikert10 0.482<br \/>\nlikert7 0.472<br \/>\nlikert5 0.469<br \/>\nlikert4 0.432<br \/>\nlikert3 0.442<br \/>\nlikert2 0.395<\/p>\n<p>So you see, introducing discreteness biases correlations towards zero, but not by much as long as the Likert scale has &gt;=5 levels. If desired, you can correct for the bias by multiplying the observed correlation by the correction factor, i.e. the true correlation (.50) divided by the observed one:<\/p>\n<p>Correction factor:<\/p>\n<p>continuous2 1<br \/>\nlikert10 1.037<br \/>\nlikert7 1.059<br \/>\nlikert5 1.066<br \/>\nlikert4 1.157<br \/>\nlikert3 1.131<br \/>\nlikert2 1.266<\/p>\n<p>Psychologically, if your data does not make sense as an interval scale, i.e. if the difference between options 1-2 is not the same as between options 3-4, then you should use Spearman&#8217;s correlation instead of Pearson&#8217;s. 
However, it will rarely make much of a difference.<\/p>\n<p>Here&#8217;s the R code.<\/p>\n<p id=\"yui_3_14_1_1_1419433087888_3072\"><em>#load library<\/em><br \/>\n<em>library(MASS)<\/em><br \/>\n<em>#simulate dataset of 2 variables with correlation of .50, N=1000<\/em><br \/>\n<em>simul.data = mvrnorm(1000, mu = c(0,0), Sigma = matrix(c(1,0.50,0.50,1), ncol = 2), empirical = TRUE)<\/em><br \/>\n<em>simul.data = as.data.frame(simul.data);colnames(simul.data) = c(&quot;continuous1&quot;,&quot;continuous2&quot;)<\/em><br \/>\n<em>#divide into bins of equal length<\/em><br \/>\n<em>simul.data[&quot;likert10&quot;] = as.numeric(cut(unlist(simul.data[2]),breaks=10))<\/em><br \/>\n<em>simul.data[&quot;likert7&quot;] = as.numeric(cut(unlist(simul.data[2]),breaks=7))<\/em><br \/>\n<em>simul.data[&quot;likert5&quot;] = as.numeric(cut(unlist(simul.data[2]),breaks=5))<\/em><br \/>\n<em>simul.data[&quot;likert4&quot;] = as.numeric(cut(unlist(simul.data[2]),breaks=4))<\/em><br \/>\n<em>simul.data[&quot;likert3&quot;] = as.numeric(cut(unlist(simul.data[2]),breaks=3))<\/em><br \/>\n<em>simul.data[&quot;likert2&quot;] = as.numeric(cut(unlist(simul.data[2]),breaks=2))<\/em><br \/>\n<em>#correlations<\/em><br \/>\n<em>round(cor(simul.data),3)<\/em><\/p>\n<\/div>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>A person on ResearchGate asked the following question: How can I correlate ordinal variables (attitude Likert scale) with continuous ratio data (years of experience)? Currently, I am working on my dissertation which explores learning organisation characteristics at HEIs. One of the predictor demographic variables is the indication of the years of experience. 
Respondents were asked [&hellip;]<\/p>\n","protected":false},"author":17,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1766],"tags":[2057,1713,2056,1202],"class_list":["post-4552","post","type-post","status-publish","format-standard","hentry","category-math-science","tag-bias","tag-correlation","tag-likert","tag-statistics","entry"],"_links":{"self":[{"href":"https:\/\/emilkirkegaard.dk\/en\/wp-json\/wp\/v2\/posts\/4552","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/emilkirkegaard.dk\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/emilkirkegaard.dk\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/emilkirkegaard.dk\/en\/wp-json\/wp\/v2\/users\/17"}],"replies":[{"embeddable":true,"href":"https:\/\/emilkirkegaard.dk\/en\/wp-json\/wp\/v2\/comments?post=4552"}],"version-history":[{"count":1,"href":"https:\/\/emilkirkegaard.dk\/en\/wp-json\/wp\/v2\/posts\/4552\/revisions"}],"predecessor-version":[{"id":4553,"href":"https:\/\/emilkirkegaard.dk\/en\/wp-json\/wp\/v2\/posts\/4552\/revisions\/4553"}],"wp:attachment":[{"href":"https:\/\/emilkirkegaard.dk\/en\/wp-json\/wp\/v2\/media?parent=4552"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/emilkirkegaard.dk\/en\/wp-json\/wp\/v2\/categories?post=4552"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/emilkirkegaard.dk\/en\/wp-json\/wp\/v2\/tags?post=4552"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}