{"id":5718,"date":"2015-12-30T21:06:50","date_gmt":"2015-12-30T20:06:50","guid":{"rendered":"http:\/\/emilkirkegaard.dk\/en\/?p=5718"},"modified":"2015-12-30T21:06:50","modified_gmt":"2015-12-30T20:06:50","slug":"r-assign-inside-nested-functions","status":"publish","type":"post","link":"https:\/\/emilkirkegaard.dk\/en\/2015\/12\/r-assign-inside-nested-functions\/","title":{"rendered":"R: assign() inside nested functions"},"content":{"rendered":"<p>Recently, I wrote a function called copy_names(). It does what you think and a little more: it copies names from one object to another. But it can also attempt to do so even when the sizes of the objects&#8217; dimensions do not match up perfectly. For instance:<\/p>\n<pre id=\"rstudio_console_output\" class=\"GEM3DMTCFGB\" tabindex=\"0\"><span class=\"GEM3DMTCLGB ace_keyword\">&gt; <\/span><span class=\"GEM3DMTCLFB ace_keyword\">t = matrix(1:9, nrow=3)\r\n<\/span><span class=\"GEM3DMTCLGB ace_keyword\">&gt; <\/span><span class=\"GEM3DMTCLFB ace_keyword\">t2 = t\r\n<\/span><span class=\"GEM3DMTCLGB ace_keyword\">&gt; <\/span><span class=\"GEM3DMTCLFB ace_keyword\">rownames(t) = LETTERS[1:3]; colnames(t) = letters[1:3]\r\n<\/span><span class=\"GEM3DMTCLGB ace_keyword\">&gt; <\/span><span class=\"GEM3DMTCLFB ace_keyword\">t\r\n<\/span>  a b c\r\nA 1 4 7\r\nB 2 5 8\r\nC 3 6 9\r\n<span class=\"GEM3DMTCLGB ace_keyword\">&gt; <\/span><span class=\"GEM3DMTCLFB ace_keyword\">t2\r\n<\/span>     [,1] [,2] [,3]\r\n[1,]    1    4    7\r\n[2,]    2    5    8\r\n[3,]    3    6    9\r\n<span class=\"GEM3DMTCLGB ace_keyword\">&gt; <\/span><span class=\"GEM3DMTCLFB ace_keyword\">copy_names(t, t2)\r\n<\/span><span class=\"GEM3DMTCLGB ace_keyword\">&gt; <\/span><span class=\"GEM3DMTCLFB ace_keyword\">t2\r\n<\/span>  a b c\r\nA 1 4 7\r\nB 2 5 8\r\nC 3 6 9<\/pre>\n<p>Here we create a matrix and make a copy of it. Then we assign dimension names to the first object. Then we inspect both of them. 
Unsurprisingly, only the first has names (because R uses <a href=\"http:\/\/adv-r.had.co.nz\/Functions.html#return-values\">copy-on-modify semantics<\/a>). Then we call the copy function, and afterwards we see that the second object has the names copied. Hooray!<\/p>\n<p>What if there is imperfect matching? The function will first check whether the number of dimensions is the same and if so, it checks each dimension to see if the lengths match in that dimension. If they do, the names for that dimension are copied. If not, that dimension is left untouched. For instance:<\/p>\n<pre id=\"rstudio_console_output\" class=\"GEM3DMTCFGB\" tabindex=\"0\"><span class=\"GEM3DMTCLGB ace_keyword\">&gt; <\/span><span class=\"GEM3DMTCLFB ace_keyword\">t = matrix(1:6, nrow=3)\r\n<\/span><span class=\"GEM3DMTCLGB ace_keyword\">&gt; <\/span><span class=\"GEM3DMTCLFB ace_keyword\">t2 = matrix(1:9, nrow=3)\r\n<\/span><span class=\"GEM3DMTCLGB ace_keyword\">&gt; <\/span><span class=\"GEM3DMTCLFB ace_keyword\">rownames(t) = LETTERS[1:3]; colnames(t) = letters[1:2]\r\n<\/span><span class=\"GEM3DMTCLGB ace_keyword\">&gt; <\/span><span class=\"GEM3DMTCLFB ace_keyword\">t\r\n<\/span>  a b\r\nA 1 4\r\nB 2 5\r\nC 3 6\r\n<span class=\"GEM3DMTCLGB ace_keyword\">&gt; <\/span><span class=\"GEM3DMTCLFB ace_keyword\">t2\r\n<\/span>     [,1] [,2] [,3]\r\n[1,]    1    4    7\r\n[2,]    2    5    8\r\n[3,]    3    6    9\r\n<span class=\"GEM3DMTCLGB ace_keyword\">&gt; <\/span><span class=\"GEM3DMTCLFB ace_keyword\">copy_names(t, t2)\r\n<\/span><span class=\"GEM3DMTCLGB ace_keyword\">&gt; <\/span><span class=\"GEM3DMTCLFB ace_keyword\">t2\r\n<\/span>  [,1] [,2] [,3]\r\nA    1    4    7\r\nB    2    5    8\r\nC    3    6    9<\/pre>\n<p>Here we create two matrices, but not of exactly the same sizes: the first is 3&#215;2 and the second is 3&#215;3. Then we assign dimnames to the first. Then we copy to the second and inspect. We see that only the dimension that matched in length (i.e. 
the first) had the names copied.<\/p>\n<h3>How does it work?<\/h3>\n<p>Before I changed it, the code looked like this (including <a href=\"https:\/\/cran.r-project.org\/web\/packages\/roxygen2\/vignettes\/roxygen2.html\">roxygen2 documentation<\/a>):<\/p>\n<pre>#' Copy names from one object to another.\r\n#'\r\n#' Attempts to copy names that fit the dimensions of vectors, lists, matrices and data.frames.\r\n#' @param x (an object) An object whose dimnames should be copied.\r\n#' @param y (an object) An object whose dimensions should be renamed.\r\n#' @param partialmatching (logical) If TRUE, copy names only for the dimensions whose lengths match; if FALSE, require all dimensions to match.\r\n#' @keywords names, rownames, colnames, copy\r\n#' @export\r\n#' @examples\r\n#' m = matrix(1:9, nrow=3)\r\n#' n = m\r\n#' rownames(m) = letters[1:3]\r\n#' colnames(m) = LETTERS[1:3]\r\n#' copy_names(m, n)\r\n#' n\r\ncopy_names = function(x, y, partialmatching = TRUE) {\r\n  library(stringr)\r\n  #find object dimensions\r\n  x_dims = get_dims(x)\r\n  y_dims = get_dims(y)\r\n  same_n_dimensions = length(x_dims) == length(y_dims)\r\n\r\n  #what is the object in y parameter?\r\n  y_obj_name = deparse(substitute(y))\r\n\r\n  #perfect matching\r\n  if (!partialmatching) {\r\n    #set names if matching dims\r\n    if (same_n_dimensions &amp;&amp; all(x_dims == y_dims)) {\r\n      attr(y, \"dimnames\") = attr(x, \"dimnames\")\r\n    } else {\r\n      #collapse each dims vector so stop() gets a single message\r\n      stop(str_c(\"Dimensions did not match! \", str_c(x_dims, collapse = \" x \"), \" vs. 
\", str_c(y_dims, collapse = \" x \")))\r\n    }\r\n  }\r\n\r\n  #if using partial matching and dimensions match in number\r\n  if (same_n_dimensions &amp;&amp; partialmatching) {\r\n    #loop over each dimension\r\n    for (dim in seq_along(dimnames(x))) {\r\n      #do lengths match?\r\n      if (x_dims[dim] == y_dims[dim]) {\r\n        dimnames(y)[[dim]] = dimnames(x)[[dim]]\r\n      }\r\n    }\r\n  }\r\n\r\n  #assign in the outer envir\r\n  assign(y_obj_name, value = y, pos = 1)\r\n}<\/pre>\n<p>The call that does the trick is the last one, namely the one using <em>assign()<\/em>. Here we modify an object outside <a href=\"http:\/\/adv-r.had.co.nz\/Environments.html#function-envs\">the function&#8217;s own environment<\/a>. How do we know which one to modify? The idea was to take one step back with pos = 1. As the next section shows, pos = 1 actually refers to the first entry on the search path, i.e. the global environment. Alternatively, one could have used <em>&lt;&lt;-<\/em>.<\/p>\n<h3>Inside nested functions<\/h3>\n<p>However, consider this scenario:<\/p>\n<pre id=\"rstudio_console_output\" class=\"GEM3DMTCFGB\" tabindex=\"0\"><span class=\"GEM3DMTCLGB ace_keyword\">&gt; <\/span><span class=\"GEM3DMTCLFB ace_keyword\">x = 1\r\n<\/span><span class=\"GEM3DMTCLGB ace_keyword\">&gt; <\/span><span class=\"GEM3DMTCLFB ace_keyword\">func1 = function() {\r\n<\/span><span class=\"GEM3DMTCLGB ace_keyword\">+ <\/span><span class=\"GEM3DMTCLFB ace_keyword\">  x = 2\r\n<\/span><span class=\"GEM3DMTCLGB ace_keyword\">+ <\/span><span class=\"GEM3DMTCLFB ace_keyword\">  print(paste0(\"x inside func1 before running func2 is \", x))\r\n<\/span><span class=\"GEM3DMTCLGB ace_keyword\">+ <\/span><span class=\"GEM3DMTCLFB ace_keyword\">  func2()\r\n<\/span><span class=\"GEM3DMTCLGB ace_keyword\">+ <\/span><span class=\"GEM3DMTCLFB ace_keyword\">  print(paste0(\"x inside func1 after running func2 is \", x))\r\n<\/span><span 
class=\"GEM3DMTCLGB ace_keyword\">+ <\/span><span class=\"GEM3DMTCLFB ace_keyword\">}\r\n<\/span><span class=\"GEM3DMTCLGB ace_keyword\">&gt; <\/span>\r\n<span class=\"GEM3DMTCLGB ace_keyword\">&gt; <\/span><span class=\"GEM3DMTCLFB ace_keyword\">func2 = function() {\r\n<\/span><span class=\"GEM3DMTCLGB ace_keyword\">+ <\/span><span class=\"GEM3DMTCLFB ace_keyword\">  print(paste0(\"x inside func2 is \", x))\r\n<\/span><span class=\"GEM3DMTCLGB ace_keyword\">+ <\/span><span class=\"GEM3DMTCLFB ace_keyword\">  print(where(\"x\"))\r\n<\/span><span class=\"GEM3DMTCLGB ace_keyword\">+ <\/span><span class=\"GEM3DMTCLFB ace_keyword\">  assign(\"x\", value = 3, pos = 1)\r\n<\/span><span class=\"GEM3DMTCLGB ace_keyword\">+ <\/span><span class=\"GEM3DMTCLFB ace_keyword\">  #x &lt;&lt;- 3\r\n<\/span><span class=\"GEM3DMTCLGB ace_keyword\">+ <\/span><span class=\"GEM3DMTCLFB ace_keyword\">}\r\n<\/span><span class=\"GEM3DMTCLGB ace_keyword\">&gt; <\/span>\r\n<span class=\"GEM3DMTCLGB ace_keyword\">&gt; <\/span><span class=\"GEM3DMTCLFB ace_keyword\">x\r\n<\/span>[1] 1\r\n<span class=\"GEM3DMTCLGB ace_keyword\">&gt; <\/span><span class=\"GEM3DMTCLFB ace_keyword\">func1()\r\n<\/span>[1] \"x inside func1 before running func2 is 2\"\r\n[1] \"x inside func2 is 1\"\r\n&lt;environment: R_GlobalEnv&gt;\r\n[1] \"x inside func1 after running func2 is 2\"\r\n<span class=\"GEM3DMTCLGB ace_keyword\">&gt; <\/span><span class=\"GEM3DMTCLFB ace_keyword\">x\r\n<\/span>[1] 3\r\n<span class=\"GEM3DMTCLGB ace_keyword\">&gt; <\/span>\r\n<span class=\"GEM3DMTCLGB ace_keyword\">&gt; <\/span><span class=\"GEM3DMTCLFB ace_keyword\">x = 1\r\n<\/span><span class=\"GEM3DMTCLGB ace_keyword\">&gt; <\/span><span class=\"GEM3DMTCLFB ace_keyword\">func2()\r\n<\/span>[1] \"x inside func2 is 1\"\r\n&lt;environment: R_GlobalEnv&gt;\r\n<span class=\"GEM3DMTCLGB ace_keyword\">&gt; <\/span><span class=\"GEM3DMTCLFB ace_keyword\">x\r\n<\/span>[1] 3<\/pre>\n<p>Here we define two functions, one of which calls the 
other. We also define x outside (in the global environment). Inside func1() we define x again with another value. However, note the strange result inside func2(). When asked to fetch x, which doesn&#8217;t exist in that function&#8217;s environment, it returns the value from the&#8230; global environment (i.e. x = 1), not from the func1() environment (x = 2)! This looks odd because func2() was called from func1(), so one might expect it to look there before trying the global environment. When we then print x in the global environment after the functions finish, we see that x has been changed there, not inside func1() as intended. This is a problem because if we call copy_names() inside a function, it is supposed to change the names of the object inside the function, not inside the global environment.<\/p>\n<p>Why is this? <a href=\"http:\/\/adv-r.had.co.nz\/Environments.html#function-envs\">It is complicated<\/a>, but as far as I can make out, it is due to the difference between the <em>calling environment<\/em> (where we call the function from) and the <em>enclosing environment<\/em> (where it was created, in the case above the global environment). R uses lexical scoping, so by default it looks up variables in the enclosing environment, not the calling environment. assign() with pos = 1 does not refer to the calling environment either: pos is an index into the search path, and position 1 is the global environment. Hence it changes the value in the global environment, not in the environment of the function that called it, as was intended.<\/p>\n<p>The fix is to use the following line instead:<\/p>\n<pre>assign(\"x\", value = 3, envir = parent.frame())<\/pre>\n<p>which assigns the value in the right environment, namely func1()&#8217;s. parent.frame() returns the environment from which the current function was called.<\/p>\n<h3>copy_names() part 2<\/h3>\n<p>This also means that the original copy_names() does not work as intended within functions. 
For instance:<\/p>\n<pre id=\"rstudio_console_output\" class=\"GEM3DMTCFGB\" tabindex=\"0\"><span class=\"GEM3DMTCLGB ace_keyword\">&gt; <\/span><span class=\"GEM3DMTCLFB ace_keyword\">get_loadings = function(fa) {\r\n<\/span><span class=\"GEM3DMTCLGB ace_keyword\">+ <\/span><span class=\"GEM3DMTCLFB ace_keyword\">library(magrittr)\r\n<\/span><span class=\"GEM3DMTCLGB ace_keyword\">+ <\/span><span class=\"GEM3DMTCLFB ace_keyword\">df = loadings(fa) %&gt;% as.vector %&gt;% matrix(nrow=nrow(fa$loadings)) %&gt;% as.data.frame\r\n<\/span><span class=\"GEM3DMTCLGB ace_keyword\">+ <\/span><span class=\"GEM3DMTCLFB ace_keyword\">loads = loadings(fa)\r\n<\/span><span class=\"GEM3DMTCLGB ace_keyword\">+ <\/span><span class=\"GEM3DMTCLFB ace_keyword\">copy_names(loads, df)\r\n<\/span><span class=\"GEM3DMTCLGB ace_keyword\">+ <\/span><span class=\"GEM3DMTCLFB ace_keyword\">return(df)\r\n<\/span><span class=\"GEM3DMTCLGB ace_keyword\">+ <\/span><span class=\"GEM3DMTCLFB ace_keyword\">}\r\n<\/span><span class=\"GEM3DMTCLGB ace_keyword\">&gt; <\/span><span class=\"GEM3DMTCLFB ace_keyword\">library(\"psych\")\r\n<\/span><span class=\"GEM3DMTCLGB ace_keyword\">&gt; <\/span><span class=\"GEM3DMTCLFB ace_keyword\">iris_fa = fa(iris[-5])\r\n<\/span><span class=\"GEM3DMTCLGB ace_keyword\">&gt; <\/span><span class=\"GEM3DMTCLFB ace_keyword\">get_loadings(iris_fa)\r\n<\/span>          V1\r\n1  0.8713121\r\n2 -0.4225686\r\n3  0.9975472\r\n4  0.9646774<\/pre>\n<p>Above, we define a new function, <em>get_loadings()<\/em>, that fetches the loadings from a factor analysis object and transforms them into a clean data.frame in a roundabout way.* We see that the returned object did not keep the dimnames despite copy_names() being called. 
The fix is the same as above, calling assign with envir = parent.frame().<\/p>\n<p>* The reason to use the roundabout way is that the loadings extracted have some odd properties that make them unusable in many functions and they also refuse to be converted to a data.frame. But it turns out that one can just change the class to &#8220;matrix&#8221; and then they are fine! So one doesn&#8217;t actually need copy_names() in this case after all.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Recently, I wrote a function called copy_names(). It does what you think and a little more: it copies names from one object to another. But it can also attempt to do so even when the sizes of the objects&#8217; dimensions do not match up perfectly. For instance: &gt; t = matrix(1:9, nrow=3) &gt; t2 = [&hellip;]<\/p>\n","protected":false},"author":17,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[2089],"tags":[2279,2278,2280,1979],"class_list":["post-5718","post","type-post","status-publish","format-standard","hentry","category-programming","tag-callling-environment","tag-dimnames","tag-nested-functions","tag-r","entry"],"_links":{"self":[{"href":"https:\/\/emilkirkegaard.dk\/en\/wp-json\/wp\/v2\/posts\/5718","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/emilkirkegaard.dk\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/emilkirkegaard.dk\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/emilkirkegaard.dk\/en\/wp-json\/wp\/v2\/users\/17"}],"replies":[{"embeddable":true,"href":"https:\/\/emilkirkegaard.dk\/en\/wp-json\/wp\/v2\/comments?post=5718"}],"version-history":[{"count":1,"href":"https:\/\/emilkirkegaard.dk\/en\/wp-json\/wp\/v2\/posts\/5718\/revisions"}],"predecessor-version":[{"id":5719,"href":"https:\/\/emilkirkegaard.dk\/en\/wp-json\/wp\/v2\/posts\/5718\/revisions\/5719"}],"wp:attachment":[{"href":"ht
tps:\/\/emilkirkegaard.dk\/en\/wp-json\/wp\/v2\/media?parent=5718"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/emilkirkegaard.dk\/en\/wp-json\/wp\/v2\/categories?post=5718"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/emilkirkegaard.dk\/en\/wp-json\/wp\/v2\/tags?post=5718"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}