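Subsetting a data.cube inside a custom function

Time: 2016-10-20 11:53:54

Tags: r function data.table subset data.cube

I am trying to write my own function that subsets a data.cube in R and automatically formats the result for some predefined plots I want to build.

Here is my function.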

require(data.table)
require(data.cube)

secciona <- function(cubo  = NULL, 
                     fecha_valor = list(), 
                     loc_valor = list(), 
                     prod_valor = list(), 
                     drop = FALSE){

    cubo[fecha_valor, loc_valor, prod_valor, drop = drop]

    ## The line above will really be an assignment of the form y <- format(cubo[...drop])
    ## Rest of code which will end up plotting the subset of the function
}

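The problem is that I keep getting the error: Error in eval(expr, envir, enclos) : object 'fecha_valor' not found

The strangest part for me is that everything works fine at the console, but not when the same call is made inside my subsetting function.

At the console: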

> dc[list(as.Date("2013/01/01"))]
> dc[list(as.Date("2013/01/01")),]
> dc[list(as.Date("2013/01/01")),,]
> dc[list(as.Date("2013/01/01")),list(),list()]

All of these return:

<data.cube>
fact:
  5627 rows x 2 dimensions x 1 measures (0.32 MB)
dimensions:
  localizacion : 4 entities x 3 levels (0.01 MB)
  producto : 153994 entities x 3 levels (21.29 MB)
total size: 21.61 MB

But whenever I try

secciona(dc)
secciona(dc, fecha_valor = list(as.Date("2013/01/01")))
secciona(dc, fecha_valor = list())

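I always get the error mentioned above.

Why does this happen? Should I take a different approach to writing my subset-and-plot function?

1 answer:

Answer 0: (score: 2)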

This is the standard issue R users run into when dealing with non-standard evaluation. It is a consequence of R's "computing on the language" feature.
The [.data.cube function is designed for interactive use, which makes the arguments passed to it more flexible but also imposes some restrictions. In that respect it is similar to [.data.table when expressions are passed from a wrapper function to the [ subset operator. I've added a dummy example to make this reproducible.
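For readers without data.cube installed, the same failure mode can be sketched in base R. Everything here is hypothetical and for illustration only: `pick_rows` is a toy stand-in that captures its argument unevaluated, it is not how `[.data.cube` is implemented, and `naive` and `fixed` are made-up wrapper names. The sketch assumes only that the NSE function evaluates the captured expression somewhere that cannot see the wrapper's local variables.

```r
# Toy NSE function (hypothetical): captures `i` unevaluated and evaluates it
# against the columns of `df`, deliberately without looking in the caller's frame.
pick_rows <- function(df, i) {
    i_expr <- substitute(i)
    keep <- eval(i_expr, df, baseenv())
    df[keep, , drop = FALSE]
}

df <- data.frame(year = c(2014, 2015, 2014), value = c(1, 2, 3))

# Interactive-style call: works, because `year` is a column of df.
direct <- pick_rows(df, year == 2014)

# Naive wrapper: the captured expression is `year == cond_value`, and
# `cond_value` only exists inside the wrapper, so the lookup fails.
naive <- function(cond_value) pick_rows(df, year == cond_value)
naive_result <- tryCatch(naive(2014), error = function(e) conditionMessage(e))

# Wrapper using substitute(): the value is inlined into the call before it is
# evaluated, so the NSE function never has to look up `cond_value`.
fixed <- function(cond_value) {
    expr <- substitute(pick_rows(df, year == .v), list(.v = cond_value))
    eval(expr)
}
fixed_result <- fixed(2014)
```

Here `naive_result` holds an "object not found" message naming `cond_value`, which mirrors the error in the question, while `fixed_result` matches `direct`.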

I see you are already using the data.cube-oop branch, so this is just to clarify for other readers. The data.cube-oop branch is 92 commits ahead of the master branch; to install it, use the following.

install.packages("data.cube", repos = paste0("https://", c(
    "jangorecki.gitlab.io/data.cube",
    "Rdatatable.github.io/data.table",
    "cran.rstudio.com"
)))

library(data.cube)
set.seed(1)
ar = array(rnorm(8,10,5), rep(2,3), 
           dimnames = list(color = c("green","red"), 
                           year = c("2014","2015"), 
                           country = c("IN","UK"))) # sorted
dc = as.data.cube(ar)

f = function(color=list(), year=list(), country=list(), drop=FALSE){
    expr = substitute(
        dc[color=.color, year=.year, country=.country, drop=.drop],
        list(.color=color, .year=year, .country=country, .drop=drop)
    )
    eval(expr)
}
f(year=list(c("2014","2015")), country="UK")
#<data.cube>
#fact:
#  4 rows x 3 dimensions x 1 measures (0.00 MB)
#dimensions:
#  color : 2 entities x 1 levels (0.00 MB)
#  year : 2 entities x 1 levels (0.00 MB)
#  country : 1 entities x 1 levels (0.00 MB)
#total size: 0.01 MB

You can inspect the expression by putting print(expr) before, or instead of, eval(expr).
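As a minimal sketch of that inspection step, the call can be built and examined without evaluating it, so neither data.cube nor a `dc` object needs to exist. The helper name `build_call` is hypothetical and is not part of data.cube; it only mirrors the substitute() pattern used in f above with fewer arguments.

```r
# Hypothetical helper: builds the subsetting call but does not evaluate it.
build_call <- function(year = list(), drop = FALSE) {
    substitute(
        dc[year = .year, drop = .drop],
        list(.year = year, .drop = drop)
    )
}

expr <- build_call(year = "2014")
print(expr)
# dc[year = "2014", drop = FALSE]

# The result is an ordinary call object, so its pieces can be examined too.
args <- as.list(expr)
```

Because the values are already inlined in the printed call, what you see is what eval(expr) would hand to the [ operator.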

Read more about non-standard evaluation:
- R Language Definition: Computing on the language
- Advanced R: Non-standard evaluation
- Manual page for the substitute function
And some related SO questions:
- Passing on non-standard evaluation arguments to the subset function
- In R, why is [ better than subset?