Installing the latest version of R on Ubuntu/Mint

I wrote about this before, but since this is a frequent problem and my last post wasn’t brief, here’s a shorter version. The primary way to install software in Linux is to rely on apt-get (apt in Mint) or some other package manager. The way this works is that there is a central server which…

R functions for analyzing missing data

I’m reading Missing Data: A Gentle Introduction and it mentions various methods to understand how data are missing in a given dataset. The book, however, is light on actual tools. So, since I have already implemented a few functions in my package for handling missing data, I decided to implement a few more. These have…
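
As a rough illustration of what "understanding how data are missing" can look like in practice, here is a minimal base R sketch. It is not the code from the package mentioned above; the toy data frame and all object names are made up for this example.

```r
# Toy data with some missing values (made up for illustration)
d <- data.frame(
  a = c(1, NA, 3, 4, NA),
  b = c("x", "y", NA, "y", "x"),
  c = c(TRUE, FALSE, TRUE, TRUE, FALSE)
)

# Count and proportion of missing values per variable
miss_by_var <- colSums(is.na(d))
miss_prop_by_var <- colMeans(is.na(d))

# Number of missing values per case (row)
miss_by_case <- rowSums(is.na(d))

# Missingness pattern per case: one digit per variable, 1 = missing.
# For example, "100" means only the first variable is missing.
miss_pattern <- apply(is.na(d), 1, function(r) paste(as.integer(r), collapse = ""))
pattern_counts <- table(miss_pattern)
```

Tabulating the patterns is a quick way to see whether missingness clusters in a few variables or is spread across cases.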

kirkegaard: Plot contingency table with ggplot2

This is a post in the ongoing series about stuff in my package: kirkegaard [I’m not egocentric but since there is no central theme about the functions in the package other than that I made and use them, there is nothing else to call it.] I figure it should be easy to find someone who wrote…

Making use of list-arrays in R

Quick recap of the main object types in R from Advanced R:

      homogeneous    heterogeneous
1-d   atomic vector  list
2-d   matrix         data.frame
n-d   array          ???

So, objects can either store only data of the same type or of any type, and they can have 1, 2 or any number of dimensions. Note that there is a…
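
One candidate for the heterogeneous n-d slot is a list with a dim attribute. The sketch below shows the idea in base R; the object names and contents are arbitrary and chosen only for illustration.

```r
# A list with a dim attribute behaves like a heterogeneous matrix
la <- list(1L, "a", TRUE, c(2.5, 3.5), NULL, mean)
dim(la) <- c(2, 3)   # filled column by column, like a regular matrix

is.list(la)      # still a list
is.matrix(la)    # but it also has two dimensions
cell <- la[[2, 2]]   # [[i, j]] extracts the content of a single cell

# The same trick works for any number of dimensions
arr <- vector("list", 8)   # 8 empty (NULL) cells
dim(arr) <- c(2, 2, 2)
arr[[1, 2, 2]] <- data.frame(x = 1:3)   # a cell can hold any object
```

Single-bracket indexing (`la[2, 2]`) returns a list of length one, while double brackets return the stored object itself.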

R: fastest way of finding out if all elements of a vector are identical?

There is a question on SO about this: stackoverflow.com/questions/4752275/test-for-equality-among-all-elements-of-a-single-vector

But I was a bit more curious, so!

    #test data, large vectors
    v1 = rep(1234, 1e6)
    v2 = runif(1e6)

    #functions to try
    all_the_same1 = function(x) {
      #range() returns c(min, max), so compare their difference to 0
      diff(range(x)) == 0
    }
    all_the_same2 = function(x) {
      max(x) == min(x)
    }
    all_the_same3 = function(x) {
      sd(x)…
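
For readers who want to try a comparison of this kind themselves, here is a self-contained sketch. The three candidate functions and their names are chosen for this example and are not necessarily the ones benchmarked in the post. They assume a non-empty numeric vector without NA values and use exact equality.

```r
# Test data: one constant vector, one with (almost surely) distinct values
set.seed(1)
v_same <- rep(1234, 1e6)
v_diff <- runif(1e6)

# Candidate checks (names are hypothetical)
same_minmax <- function(x) max(x) == min(x)
same_unique <- function(x) length(unique(x)) == 1
same_first  <- function(x) all(x == x[1])

checks <- list(minmax = same_minmax, unique = same_unique, first = same_first)

# Each check should return TRUE for v_same and FALSE for v_diff
results <- sapply(checks, function(f) c(same = f(v_same), diff = f(v_diff)))

# Rough timing: elapsed seconds for 10 calls on the constant vector
timings <- sapply(checks, function(f) {
  system.time(for (i in 1:10) f(v_same))[["elapsed"]]
})
```

Timings from `system.time()` vary between machines and runs, so treat them as a rough ranking rather than exact numbers.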

R: assign() inside nested functions

Recently, I wrote a function called copy_names(). It does what you think and a little more: it copies names from one object to another. But it can also attempt to do so even when the sizes of the objects’ dimensions do not match up perfectly. For instance:

    > t = matrix(1:9, nrow=3)
    > t2 =…
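
The general issue named in the title can be sketched independently of copy_names(). By default, assign() writes into the environment it is called from, so a helper function that wants to modify a variable in its caller has to pass the target environment explicitly. The function names below are hypothetical.

```r
outer_fun <- function() {
  x <- "original"

  inner_default <- function() {
    # No envir given: this creates a new local x inside inner_default only
    assign("x", "changed by inner_default")
    invisible(NULL)
  }

  inner_parent <- function() {
    # parent.frame() is the frame inner_parent was called from (outer_fun)
    assign("x", "changed by inner_parent", envir = parent.frame())
    invisible(NULL)
  }

  inner_default()
  after_default <- x   # unchanged by inner_default
  inner_parent()
  after_parent <- x    # modified by inner_parent

  c(after_default = after_default, after_parent = after_parent)
}

res <- outer_fun()
```

With deeper nesting, each additional level needs its own hop (for example `parent.frame(2)`), which is why it is often simpler to pass the target environment along as an argument.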

kirkegaard: conditional recoding with conditional_change()

Usually working with large public datasets requires that one recode variables. This can be quite repetitive. When variables only have a few possible values, one can use something like plyr’s mapvalues() for great benefit (see my answer at SO). However, when there is an indefinite number of different values, it is not useful. What one…
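
To make the idea of recoding by condition rather than by value concrete, here is a base R sketch. It does not use or reproduce conditional_change(); the helper recode_where() and the example codes are hypothetical.

```r
# Made-up variable where negative codes and a sentinel top code mean "no answer"
income <- c(1200, -1, 3400, -9, 0, 560, 99999)

# Hypothetical helper: replace every element for which condition(x) is TRUE
recode_where <- function(x, condition, new_value) {
  hit <- which(condition(x))   # which() ignores NA results of the condition
  x[hit] <- new_value
  x
}

recoded <- recode_where(income, function(v) v < 0, NA)        # all negative codes
recoded <- recode_where(recoded, function(v) v >= 99999, NA)  # sentinel top code
```

Because the rule is a function of the values, it covers any number of distinct codes, which a fixed from/to mapping cannot do.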

Ethnic heterogeneity and tail effects

Chisala has his 3rd installment up: www.unz.com/article/closing-the-black-white-iq-gap-debate-part-3/ One idea I had while reading it was that tail effects interact with population ethnic/racial heterogeneity. To show this, I did a simulation experiment. Population 1 is a regular population with a mean of 0 and sd of 1. Population 2 is a composite population of three sub-populations:…
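
A simulation of this general shape can be sketched as follows. The excerpt does not show the parameters of the three sub-populations, so the means, shares, sd and threshold below are placeholders chosen only to make the code run, not the values used in the post.

```r
set.seed(1)
n <- 1e6

# Population 1: a single normal distribution with mean 0 and sd 1
pop1 <- rnorm(n, mean = 0, sd = 1)

# Population 2: mixture of three sub-populations (all values hypothetical)
sub_means  <- c(-1, 0, 1)
sub_shares <- c(1, 1, 1) / 3
sub_id <- sample(seq_along(sub_means), n, replace = TRUE, prob = sub_shares)
pop2 <- rnorm(n, mean = sub_means[sub_id], sd = 1)

# Share of each population above a threshold in the upper tail
threshold <- 2
tail_share <- c(pop1 = mean(pop1 > threshold), pop2 = mean(pop2 > threshold))

# The same quantities computed analytically, as a check on the simulation
tail_exact <- c(
  pop1 = pnorm(threshold, lower.tail = FALSE),
  pop2 = sum(sub_shares * pnorm(threshold, mean = sub_means, lower.tail = FALSE))
)
```

Under these placeholder settings both populations have a mean of 0, but the composite population has more spread, so a larger share of it lies above the threshold.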
