Correlations and Likert scales: What is the bias?

A person on ResearchGate asked the following question:

How can I correlate ordinal variables (attitude Likert scale) with continuous ratio data (years of experience)?
Currently, I am working on my dissertation which explores learning organisation characteristics at HEIs. One of the predictor demographic variables is the indication of the years of experience. Respondents were asked to fill in the gap the number of years. Should I categorise the responses instead? as for example:
1. from 1 to 4 years
2. from 4 to 10
and so on?
or is there a better choice/analysis I could apply?

My answer may also be of interest to others, so I post it here as well.

Normal practice is to treat Likert scales as continuous variables even though they are not. As long as there are >=5 options, the bias from discreteness is not large.

I simulated the situation for you. I generated two variables with continuous random data from two normal distributions with a correlation of .50, N=1000. Then I created likert scales of varying levels from the second variable. Then I correlated all these variables with each other.

Correlations of continuous variable 1 with:

continuous2 0.5
likert10 0.482
likert7 0.472
likert5 0.469
likert4 0.432
likert3 0.442
likert2 0.395

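So you see, introducing discreteness biases correlations towards zero, but not by much as long as the Likert scale has >=5 levels. You can correct for the bias by multiplying the observed correlation by the correction factor if desired (here, the true correlation of .50 divided by the observed correlation):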

Correction factor:

continuous2 1
likert10 1.037
likert7 1.059
likert5 1.066
likert4 1.157
likert3 1.131
likert2 1.266
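
As a small sketch of where these factors come from, the snippet below divides the true correlation (.50) by each observed correlation from the table above. The variable names are mine, and the factors come from a single simulated dataset, so they are approximate rather than general constants.

```python
# True correlation used in the simulation, and the observed correlations
# copied from the table above.
true_r = 0.50
observed = {
    "likert10": 0.482,
    "likert7": 0.472,
    "likert5": 0.469,
    "likert4": 0.432,
    "likert3": 0.442,
    "likert2": 0.395,
}

# Correction factor = true correlation / observed correlation.
factors = {name: round(true_r / r, 3) for name, r in observed.items()}

for name, factor in factors.items():
    print(name, factor)
```

To apply a factor, multiply an observed correlation by the factor for the matching number of levels.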

Psychologically, if your data does not make sense as an interval scale, i.e. if the difference between options 1-2 is not the same as between options 3-4, then you should use Spearman’s correlation instead of Pearson’s. However, it will rarely make much of a difference.

Here’s the R code.

#load library
library(MASS) #for mvrnorm()
#NB: the object name was lost from the original listing; "d" is a placeholder
#simulate dataset of 2 variables with correlation of .50, N=1000
d = mvrnorm(1000, mu = c(0,0), Sigma = matrix(c(1,0.50,0.50,1), ncol = 2), empirical = TRUE)
d = as.data.frame(d)
colnames(d) = c("continuous1","continuous2")
#divide into bins of equal length
d["likert10"] = as.numeric(cut(unlist(d[2]),breaks=10))
d["likert7"] = as.numeric(cut(unlist(d[2]),breaks=7))
d["likert5"] = as.numeric(cut(unlist(d[2]),breaks=5))
d["likert4"] = as.numeric(cut(unlist(d[2]),breaks=4))
d["likert3"] = as.numeric(cut(unlist(d[2]),breaks=3))
d["likert2"] = as.numeric(cut(unlist(d[2]),breaks=2))
#correlate all variables with each other (step described in the text above)
round(cor(d), 3)
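
For readers who prefer Python, here is a rough sketch of the same simulation using only the standard library. It is not a port of the R code: the helper names (pearson, to_likert) are my own, the sample correlation is only approximately .50 (R's mvrnorm with empirical = TRUE makes it exact), and the equal-width binning is a simplified stand-in for R's cut(). Expect numbers close to, but not identical with, the table above.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def to_likert(values, levels):
    """Cut values into `levels` equal-width bins over their range, coded 1..levels."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / levels
    # The maximum value would land in bin levels+1, so clamp it to the top bin.
    return [min(int((v - lo) / width) + 1, levels) for v in values]

rng = random.Random(1)  # fixed seed so the run is repeatable
rho, n = 0.5, 1000

# Two standard normal variables with population correlation rho.
x = [rng.gauss(0, 1) for _ in range(n)]
y = [rho * a + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1) for a in x]

results = {"continuous2": pearson(x, y)}
for levels in (10, 7, 5, 4, 3, 2):
    results[f"likert{levels}"] = pearson(x, to_likert(y, levels))

for name, r in results.items():
    print(name, round(r, 3))
```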

Is the sum of cubes equal to the squared sum of the counting integers?

R can tell us:

DF.numbers = data.frame(cubesum=numeric(),sumsquare=numeric()) #initial dataframe
for (n in 1:100){ #loop and fill in
  DF.numbers[n,"cubesum"] = sum((1:n)^3)
  DF.numbers[n,"sumsquare"] = sum(1:n)^2
}

library(car) #for the scatterplot() function
scatterplot(cubesum ~ sumsquare, DF.numbers,
            smoother=FALSE, #no moving average
            labels = rownames(DF.numbers), id.n = nrow(DF.numbers), #labels
            log = "xy", #logscales
            main = "Cubesum is identical to sumsquare, proven by induction")

#checks that they are identical, except for the name
all.equal(DF.numbers["cubesum"],DF.numbers["sumsquare"], check.names=FALSE)



One can increase the number in the loop to test more numbers. I did test it with 1:10000, and it was still true.
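
The same check can be sketched in Python, where integers have arbitrary precision, so rounding cannot creep in for large n. The function names are mine. The closed form used for comparison, (n(n+1)/2)^2, is the standard formula for the sum of the first n integers, squared.

```python
def cube_sum(n):
    """1^3 + 2^3 + ... + n^3"""
    return sum(k ** 3 for k in range(1, n + 1))

def squared_sum(n):
    """(1 + 2 + ... + n)^2"""
    return sum(range(1, n + 1)) ** 2

def closed_form(n):
    """(n(n+1)/2)^2, using integer division (n(n+1) is always even)."""
    return (n * (n + 1) // 2) ** 2

# Collect every n in 1..1000 where the three quantities disagree.
mismatches = [
    n for n in range(1, 1001)
    if not (cube_sum(n) == squared_sum(n) == closed_form(n))
]
print(mismatches)
```

An empty list means no counterexample was found in that range. For an actual induction proof, the key step is that adding (n+1)^3 to (n(n+1)/2)^2 gives ((n+1)(n+2)/2)^2.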

Paper: Revisiting a 90-year-old debate: the advantages of the mean deviation

Actually I'm busy doing an exam paper for linguistics class, but it turned out to be not so difficult, so I spent some time on Khan Academy doing probability and statistics courses. I want to master that stuff, especially the stuff I don't currently know the details about, like regression.

Anyway, I stumbled onto a comment asking about the way the standard deviation is calculated. Why not just use the absolute value instead of squaring stuff and taking the square root after? I actually tried that once, and it gives different results! I tried it out because the teacher's notes said that it would give the same results. Pretty neat discovery IMO.

Anyway, the other one has a name as well: the mean deviation (also called the mean absolute deviation).

Here's a paper that argues that we should really return to the MD (mean deviation). I didn't understand all the math, but the MD sure is easier to calculate and its meaning is easier to grasp, although it's probably too difficult to switch now that most of statistics is based on the SD. Still cool though.
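
To make the difference concrete, here is a minimal sketch comparing the two measures on a small made-up dataset (the numbers are mine, chosen only so the arithmetic comes out clean). Both are computed as population measures, i.e. dividing by N.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up numbers, mean = 5

mean = statistics.mean(data)

# Standard deviation: square the deviations, average them, take the square root.
sd = statistics.pstdev(data)

# Mean deviation: average the absolute deviations, with no squaring or root.
md = sum(abs(x - mean) for x in data) / len(data)

print("SD:", sd)
print("MD:", md)
```

For this dataset the SD is 2.0 and the MD is 1.5, so the two are not interchangeable. Squaring gives the large deviations (here the 9 and the 2) more weight than taking absolute values does.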

Revisiting a 90-year-old debate the advantages of the mean deviation

ABSTRACT: This paper discusses the reliance of numerical analysis on the concept of the standard deviation, and its close relative the variance. It suggests that the original reasons why the standard deviation concept has permeated traditional statistics are no longer clearly valid, if they ever were. The absolute mean deviation, it is argued here, has many advantages over the standard deviation. It is more efficient as an estimate of a population parameter in the real-life situation where the data contain tiny errors, or do not form a completely perfect normal distribution. It is easier to use, and more tolerant of extreme values, in the majority of real-life situations where population parameters are not required. It is easier for new researchers to learn about and understand, and also closely linked to a number of arithmetic techniques already used in the sociology of education and elsewhere. We could continue to use the standard deviation instead, as we do presently, because so much of the rest of traditional statistics is based upon it (effect sizes, and the F-test, for example). However, we should weigh the convenience of this solution for some against the possibility of creating a much simpler and more widespread form of numeric analysis for many.

Keywords: variance, measuring variation, political arithmetic, mean deviation, standard deviation, social construction of statistics

It also has a new, odd use of “social construction”, which annoyed me when reading it.

Paper: Musical beauty and information compression: Complex to the ear but simple to the mind? (Nicholas J Hudson)

I was researching a different topic and came across this paper. I was rewatching the Everything is a Remix series. Then I looked up some more relevant links, and came across these videos. One of them mentioned this article.

Complex to the ear but simple to the mind (Nicholas J Hudson)


Background: The biological origin of music, its universal appeal across human cultures and the cause of its beauty remain mysteries. For example, why is Ludwig Van Beethoven considered a musical genius but Kylie Minogue is not? Possible answers to these questions will be framed in the context of Information Theory.

Presentation of the Hypothesis: The entire life-long sensory data stream of a human is enormous. The adaptive solution to this problem of scale is information compression, thought to have evolved to better handle, interpret and store sensory data. In modern humans highly sophisticated information compression is clearly manifest in philosophical, mathematical and scientific insights. For example, the Laws of Physics explain apparently complex observations with simple rules. Deep cognitive insights are reported as intrinsically satisfying, implying that at some point in evolution, the practice of successful information compression became linked to the physiological reward system. I hypothesise that the establishment of this “compression and pleasure” connection paved the way for musical appreciation, which subsequently became free (perhaps even inevitable) to emerge once audio compression had become intrinsically pleasurable in its own right.

Testing the Hypothesis: For a range of compositions, empirically determine the relationship between the listener’s pleasure and “lossless” audio compression. I hypothesise that enduring musical masterpieces will possess an interesting objective property: despite apparent complexity, they will also exhibit high compressibility.

Implications of the Hypothesis: Artistic masterpieces and deep Scientific insights share the common process of data compression. Musical appreciation is a parasite on a much deeper information processing capacity. The coalescence of mathematical and musical talent in exceptional individuals has a parsimonious explanation. Musical geniuses are skilled in composing music that appears highly complex to the ear yet transpires to be highly simple to the mind. The listener’s pleasure is influenced by the extent to which the auditory data can be resolved in the simplest terms possible.

Interesting, but it is way too short on data. It's not that difficult to acquire some data to test this hypothesis. Various open source lossless compressors are freely available; I'm thinking particularly of FLAC compressors. Then one needs a huge library of music, and some sort of ranking of the music by quality. If the hypothesis is correct, then the best music should come out on top, at least relatively within genres, or within bands etc. I think I will test this myself.
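
A very rough sketch of the measurement step, to show the idea. Big assumption: it uses zlib from the Python standard library as a stand-in for a lossless audio codec such as FLAC, and two made-up byte streams as a stand-in for decoded audio. A real test would need actual recordings, an actual audio compressor and some quality ranking, none of which are shown here.

```python
import random
import zlib

def compression_ratio(data: bytes) -> float:
    """Compressed size divided by original size; lower means more compressible."""
    return len(zlib.compress(data, 9)) / len(data)

# Two made-up byte streams (purely illustrative, not real audio).
rng = random.Random(0)
repetitive = b"abcd" * 2500                              # highly regular pattern
noisy = bytes(rng.getrandbits(8) for _ in range(10000))  # pseudo-random noise

print("repetitive:", compression_ratio(repetitive))
print("noisy:", compression_ratio(noisy))
```

The regular stream compresses to a small fraction of its size while the noise barely compresses at all. The hypothesis would predict that highly rated pieces sit closer to the compressible end than they sound.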

Something about certainty, proofs in math, induction/abduction

This conversation followed me posting the post just before, and several people bringing up the same proof.

Aowpwtomsihermng = Afraid of what people will think of me, so i had Emil remove my name-guy

[09:57:00] Emil – Deleet:
[09:58:50] Aowpwtomsihermng: Your mates know their algebra.
[10:00:09] Emil – Deleet: this guy is a mathematician
[10:00:27] Emil – Deleet: fysicist ppl have not chimed in yet
[10:00:32] Emil – Deleet: they are having classes i think
[10:08:18] Aowpwtomsihermng: Have you worked out the inductive proof yet?
[10:09:33] Emil – Deleet: no
[10:09:40] Emil – Deleet: i dont know how they work in detail
[10:09:43] Emil – Deleet: and it takes time
[10:09:49] Emil – Deleet: and i already crowdsourced the problem
[10:10:00] Emil – Deleet: so… doesnt pay for me to look for it
[10:10:19] Aowpwtomsihermng: CBA, right?
[10:10:24] Emil – Deleet: i didnt even need any fancy math proof to begin with
[10:10:30] Emil – Deleet: since i already proved it to my satisfaction
[10:10:54] Aowpwtomsihermng: Induction in the logical rather than mathematical sense…
[10:11:00] Emil – Deleet: yes
[10:11:17] Aowpwtomsihermng: Not as rigorous, but useful anyway.
[10:11:23] Emil – Deleet: or abduction
[10:11:46] Emil – Deleet: mathematical certainty is overrated
[10:11:48] Emil – Deleet: ;)
[10:11:59] Emil – Deleet: just look at economics
[10:12:02] Emil – Deleet: :P
[10:12:27] Aowpwtomsihermng: You never know, it might have worked for the first twenty numbers then stopped working. Unlikely, but possible.
[10:12:48] Aowpwtomsihermng: At least now you know that’s not the case.
[10:12:49] Emil – Deleet: astronomically unlikely
[10:12:56] Emil – Deleet: and i also tried other random numbers
[10:13:02] Emil – Deleet: like 3242
[10:13:21] Emil – Deleet: IMO, not much certainty was gained
[10:13:50 | Edited 10:14:04] Emil – Deleet: its approximately as likely that we missed an error in the proof as it is that abduction/induction fails in this case
[10:14:26] Aowpwtomsihermng: But once you have two or three proofs, then that likelihood drops dramatically.
[10:14:46] Emil – Deleet: perhaps
[10:15:00] Aowpwtomsihermng: But I take your point, it’s not a *great* deal of extra certainty.
[10:15:15] Emil – Deleet: for practice, its an irrelevant increase
[10:15:34] Emil – Deleet: if it comes at a great time cost – not worth it
[10:15:41] Emil – Deleet: thats what mathematicians are for ;)
[10:15:50] Emil – Deleet: (with the implication that their time isnt worth much! :D)
[10:16:55 | Edited 10:17:14] Aowpwtomsihermng: Right, right. We programmers and mathematicians are mere cogs in the machinery of your grand device.
[10:17:19] Emil – Deleet: ^^
[10:17:36] Emil – Deleet: at least ure part of something great ^^
[10:17:37] Emil – Deleet: :P

An alternative way to calculate squares, without using multiplication

I was once at a party, and I was somewhat bored, and I found this way of calculating the next square. It works without multiplication, so it's suitable for mental calculation.

Seeing that I have recently learned Python, here's a Python version of it:

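n = 10  # how many squares to print

def sq(x):
    # reference version: ordinary multiplication
    return x * x

for y in range(1, n + 1):
    print(sq(y))

def sqx(x):
    # next square from the two previous squares, using only + and -
    if x == 1:
        return 1
    if x == 2:
        return 4
    return (sqx(x - 1) - sqx(x - 2)) + sqx(x - 1) + 2

for y in range(1, n + 1):
    print(sqx(y))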

In English: first, set the first two squares to 1 and 4, since this method needs the two previous squares to calculate the next. Then calculate the absolute difference between these two. Suppose we are looking for 3^2, so the previous two squares are 1 and 4. The absolute difference is 3. Add 2 to this, giving 5. Add 5 to the previous square, so 4+5=9, and 9 is 3^2.

I have no idea why this works; I just saw a pattern, and confirmed it for the first 20 integers or so.

In the code above, I have defined the function recursively. It is much slower than the other function. I suppose both are slower than the low-level premade function pow(n,m). But it certainly is cool. :P
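
For what it's worth, the pattern has a simple explanation: the gap between consecutive squares, x^2 - (x-1)^2, equals 2x - 1, so the gaps are the odd numbers 3, 5, 7, ... and each gap is 2 larger than the one before. The recursive version is slow because every call recomputes the earlier squares. Below is a sketch of an iterative variant that keeps the squares in a list instead (the function name is mine).

```python
def squares_by_addition(count):
    """Return the first `count` squares using only addition and subtraction."""
    squares = []
    for i in range(1, count + 1):
        if i == 1:
            squares.append(1)
        elif i == 2:
            squares.append(4)
        else:
            prev, prev2 = squares[-1], squares[-2]
            # previous gap (prev - prev2), plus 2, is the next gap
            squares.append(prev + (prev - prev2) + 2)
    return squares

print(squares_by_addition(10))
# [1, 4, 9, 16, 25, 36, 49, 64, 81, 100]
```

Each square is computed once, so this runs in time proportional to the count.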

Another linguistics trip on Wiki



I just wanted to look up some stuff on the questions that a teacher had posed. Since I don't actually have the book, and since one can't search properly in paper books, I googled around instead, and of course ended up at Wikipedia…

and it took off as usual. Here are the tabs I ended up with (36 tabs):



and with three more longer texts to consume over the next day or so: (which I had discovered independently) (long overdue)


And quite a few other longer texts in pdf form also to be read in the next few days.