kirkegaard: df_add_delta()

One idea for a series of blog posts is to write about new functions in my R package. Often I just push these without letting anyone know, but I guess it could be useful to introduce them here (the more interesting ones, anyway).

Function description: Adds delta (difference) columns to a data.frame. These are made from one primary variable and a number of secondary variables. Variables can be given either by indices or by name. If no secondary variables are given, all numeric variables are used.


> iris %>% head %>% df_add_delta(1)
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species delta_Sepal.Length_Sepal.Width
1          5.1         3.5          1.4         0.2  setosa                            1.6
2          4.9         3.0          1.4         0.2  setosa                            1.9
3          4.7         3.2          1.3         0.2  setosa                            1.5
4          4.6         3.1          1.5         0.2  setosa                            1.5
5          5.0         3.6          1.4         0.2  setosa                            1.4
6          5.4         3.9          1.7         0.4  setosa                            1.5
  delta_Sepal.Length_Petal.Length delta_Sepal.Length_Petal.Width
1                             3.7                            4.9
2                             3.5                            4.7
3                             3.4                            4.5
4                             3.1                            4.4
5                             3.6                            4.8
6                             3.7                            5.0

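So, we see that three variables were created. Each name is built from a prefix (here delta), the name of the primary variable and the name of the secondary variable, joined by a separator (here _). Both the prefix and the separator are configurable. The difference scores are given in natural units, but can also be standardized automatically if desired.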
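To make the natural-units versus standardized distinction concrete, here is a small base R sketch. It assumes that "standardized" means z-scoring the difference column (subtract its mean, divide by its standard deviation). That reading is my assumption for illustration, and the sketch does not call the package function or any of its arguments.

```r
# Difference score in natural units (centimetres for iris).
d <- iris$Sepal.Length - iris$Sepal.Width

# Assumed meaning of "standardized": z-scores of the difference column.
# scale() returns a one-column matrix, so convert back to a plain vector.
z <- as.numeric(scale(d))

# The natural-unit scores keep their original spread; the z-scores have
# mean 0 and standard deviation 1 (up to floating-point error).
summary(d)
c(mean = mean(z), sd = sd(z))
```

The standardized version is handy when deltas from variables on different scales need to be compared.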

Three delta variables were made because we chose variable 1 (Sepal.Length) as the primary, and the function automatically selects the remaining numeric variables as secondaries. Species is not numeric, so it is skipped.
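For readers who want to see the logic spelled out, the behaviour described above can be roughly sketched in base R. This is not the package's actual code: the function name add_delta_sketch and the argument names primary, secondary, prefix and sep are hypothetical, chosen only for this illustration.

```r
# Hypothetical re-implementation for illustration only; the names and
# arguments here are assumptions, not the package's real interface.
add_delta_sketch <- function(df, primary, secondary = NULL,
                             prefix = "delta", sep = "_") {
  # Accept the primary variable as an index or as a name.
  if (is.numeric(primary)) primary <- names(df)[primary]

  if (is.null(secondary)) {
    # Default: every numeric column except the primary is a secondary.
    is_num <- vapply(df, is.numeric, logical(1))
    secondary <- setdiff(names(df)[is_num], primary)
  } else if (is.numeric(secondary)) {
    secondary <- names(df)[secondary]
  }

  # One new column per secondary variable: primary minus secondary.
  for (s in secondary) {
    new_name <- paste(prefix, primary, s, sep = sep)
    df[[new_name]] <- df[[primary]] - df[[s]]
  }
  df
}

out <- add_delta_sketch(head(iris), 1)
names(out)[6:8]
```

Run on head(iris) with variable 1 as the primary, this sketch adds the same three column names as in the output above, and the first new column is Sepal.Length minus Sepal.Width.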