Passing multiple arguments to a data.table inside a function

Time: 2019-07-29 11:43:35

Tags: r data.table

This is the data.table output I want.

library(data.table)
dt_mtcars <- as.data.table(mtcars)

## desired output ----
dt_mtcars[mpg >20
          , .(mean_mpg = mean(mpg)
              ,median_mpg = median(mpg))
          , .(cyl, gear)]

   cyl gear mean_mpg median_mpg
1:   6    4   21.000      21.00
2:   4    4   26.925      25.85
3:   6    3   21.400      21.40
4:   4    3   21.500      21.50
5:   4    5   28.200      28.20

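I would like to get the same output by passing the filter, the summary expressions, and the grouping columns as arguments to a function.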

processFUN <- function(dt, where, select, group){

  out <- dt[i=eval(parse(text = where))
            ,j=eval(parse(text = select))
            ,by=eval(parse(text = group))]

  return(out)
}

report <- processFUN(dt_mtcars 
                     ,where= "mpg > 20"
                     ,select= ".(mean_mpg = mean(mpg), median_mpg = median(mpg))"
                     ,group= ".(cyl, gear)")

However, I get an error message.

 Error in .(cyl, gear) : could not find function "." 

3 answers:

Answer 0: (score: 4)

Just to give you another option: if you can and want to use table.express, you can also pass strings in many cases:

library(data.table)
library(table.express)

processFUN <- function(dt, where, select, group) {
  dt %>%
    start_expr %>%
    group_by(!!!group, .parse = TRUE) %>%
    where(!!!where, .parse = TRUE) %>%
    transmute(!!!select, .parse = TRUE) %>%
    end_expr
}

processFUN(as.data.table(mtcars),
           "mpg>20",
           c("mean_mpg = mean(mpg)", "median_mpg = median(mpg)"),
           c("cyl", "gear"))
   cyl gear     V1    V2
1:   6    4 21.000 21.00
2:   4    4 26.925 25.85
3:   6    3 21.400 21.40
4:   4    3 21.500 21.50
5:   4    5 28.200 28.20

In the next version, start_expr and end_expr will be optional.

Answer 1: (score: 3)

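Do you really want to pass the conditions as strings? If so, one way is to build the whole query as a single string with paste0 and then evaluate it with eval(parse(...)):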

library(data.table)

processFUN <- function(dt, where, select, group){
    eval(parse(text = paste0(as.character(substitute(dt)), "[", where, ",", 
               select, ",by = ", group, "]")))
}

processFUN(dt_mtcars 
          ,where= "mpg > 20"
          ,select= ".(mean_mpg = mean(mpg), median_mpg = median(mpg))"
          ,group= ".(cyl, gear)")


#   cyl gear mean_mpg median_mpg
#1:   6    4   21.000      21.00
#2:   4    4   26.925      25.85
#3:   6    3   21.400      21.40
#4:   4    3   21.500      21.50
#5:   4    5   28.200      28.20
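As a variation on the string-based approach above, each string can be parsed on its own and spliced into one `dt[i, j, by = ...]` call, instead of pasting the whole query together. This is only a sketch: `processFUN_call` is a hypothetical name that does not appear in the original answer. It uses base R's `parse()`, `substitute()` and `eval()`, and it does not need the table to be passed as a bare variable name.

```r
library(data.table)

dt_mtcars <- as.data.table(mtcars)

# Hypothetical variant of processFUN (name invented for this sketch):
# parse each string separately, then build one call to `[` and evaluate it.
processFUN_call <- function(dt, where, select, group) {
  call <- substitute(
    dt[i, j, by = b],
    list(i = parse(text = where)[[1]],   # e.g. mpg > 20
         j = parse(text = select)[[1]],  # e.g. .(mean_mpg = mean(mpg), ...)
         b = parse(text = group)[[1]])   # e.g. .(cyl, gear)
  )
  # `dt` in the call is this function's argument, so evaluate in this frame
  eval(call)
}

report <- processFUN_call(dt_mtcars,
                          where  = "mpg > 20",
                          select = ".(mean_mpg = mean(mpg), median_mpg = median(mpg))",
                          group  = ".(cyl, gear)")
print(report)
```

Because the finished call contains the expressions literally, data.table sees `.(cyl, gear)` the same way it would in a hand-written query.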

Answer 2: (score: 1)

Or use eval together with substitute, so that the arguments can be passed unquoted:

library(data.table) #Win R-3.5.1 x64 data.table_1.12.2
dt_mtcars <- as.data.table(mtcars)

processFUN <- function(dt, where, select, group) {

    out <- dt[i=eval(substitute(where)), 
        j=eval(substitute(select)), 
        by=eval(substitute(group))]

    return(out)
}

processFUN(dt_mtcars, mpg>20, .(mean_mpg=mean(mpg), median_mpg=median(mpg)), .(cyl, gear))

Some of the earliest references I could find:

  1. Aggregating sub totals and grand totals with data.table
  2. Using data.table i and j arguments in functions

The old FAQ 1.6 also contains a reference to this: http://datatable.r-forge.r-project.org/datatable-faq.pdf
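A related sketch for the unquoted-argument style of this answer: capture the whole query with a single `substitute()` and evaluate it in the caller's frame. `processFUN_nse` is a hypothetical name, not part of the original answer. This version assumes the table is passed as an expression that can be evaluated in the calling environment.

```r
library(data.table)

dt_mtcars <- as.data.table(mtcars)

# Hypothetical variant (name invented for this sketch): substitute() replaces
# dt, where, select and group with the unevaluated expressions the caller wrote.
processFUN_nse <- function(dt, where, select, group) {
  call <- substitute(dt[where, select, by = group])
  # The call now reads like a hand-written query, so run it where the
  # caller's objects (here dt_mtcars) live.
  eval(call, parent.frame())
}

report_nse <- processFUN_nse(dt_mtcars,
                             mpg > 20,
                             .(mean_mpg = mean(mpg), median_mpg = median(mpg)),
                             .(cyl, gear))
print(report_nse)
```

The design choice here is to hand data.table one ordinary-looking call, rather than three separate `eval()` expressions inside `[`.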