Passing data.table's j slot (variable names, functions and arguments) as parameters

Date: 2019-08-19 12:06:58

Tags: r data.table

I would like to supply the functions to be executed in the j slot of a data.table as arguments, along the following lines:

library(data.table)

DT <- as.data.table(structure(list(peak.grp = c(1L, 2L, 2L, 2L, 2L),
    s = c(248, 264, 282, 304, 333),
    height = c(222772.8125, 370112.28125, 426524.03125, 649691.75, 698039)),
    class = "data.frame", row.names = c(NA, -5L)))

list_of_functions_with_parameters <- list(sum = list(x = s, na.rm = TRUE), mean = list(x = height, na.rm = TRUE))

vector_of_variable_names <- c("Sum.s", "Mean.height")

vector_for_by <- c("peak.grp")

perform_dt_operations <-
    function(DT, vector_of_variable_names, list_of_functions_with_parameters, vector_for_by){

    DT <- DT[, .(vector_of_variable_names = list_of_functions_with_parameters), by = row.names(DT)]

    return(DT)

}

The output should be:

Output <- perform_dt_operations(DT, vector_of_variable_names, list_of_functions_with_parameters, vector_for_by)


dput(as.data.frame(Output))

structure(list(peak.grp = c(1, 2), Sum.s = c(248, 1183), Mean.height = c(222772.8125, 
536091.765625)), row.names = c(NA, -2L), class = "data.frame")

Is there a way to do something like this?

1 Answer:

Answer 0 (score: 3)

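This is only possible if the elements of list_of_functions_with_parameters are quoted, i.e. left unevaluated until the column names s and height can be resolved inside the data.table. That means the list must be created with alist() rather than list():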

list_of_functions_with_parameters <- alist(sum = list(x = s, na.rm = TRUE), 
                                       mean = list(x = height, na.rm = TRUE))

vector_of_variable_names <- c("Sum.s", "Mean.height")

vector_for_by <- c("peak.grp")

perform_dt_operations <-
  function(DT, vector_of_variable_names, list_of_functions_with_parameters, vector_for_by){

    stopifnot(length(vector_of_variable_names) == length(list_of_functions_with_parameters))

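    DT[,{
      res <- list()
      for (i in seq_along(vector_of_variable_names)) {
        # evaluate the quoted argument list within the current group,
        # where the column names (s, height) are visible
        l <- eval(list_of_functions_with_parameters[[i]])
        # call the function named by the list element with those arguments
        res[[vector_of_variable_names[i]]] <-
              do.call(names(list_of_functions_with_parameters)[i], l)
      }
      res  # named list: one output column per requested variable name
    }, by = vector_for_by]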
  }

perform_dt_operations(DT, vector_of_variable_names,
  list_of_functions_with_parameters, vector_for_by)

#   peak.grp Sum.s Mean.height
#1:        1   248    222772.8
#2:        2  1183    536091.8
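
To see why alist() is needed, here is a minimal base-R sketch of the same mechanism without data.table. The names args_quoted and fake_columns are made up for illustration; fake_columns simply stands in for the columns that data.table makes visible inside j.

```r
# alist() stores its arguments unevaluated, so `s` does not need to exist yet.
# (With list(), R would try to look up `s` immediately and fail.)
args_quoted <- alist(sum = list(x = s, na.rm = TRUE))

# Each element is an unevaluated call to list(...)
is.call(args_quoted$sum)

# Hypothetical stand-in for the columns available inside j
fake_columns <- list(s = c(248, 264, NA))

# Evaluate the quoted argument list where `s` can be found ...
evaluated_args <- eval(args_quoted$sum, fake_columns)

# ... then call the function named by the list element with those arguments
total <- do.call(names(args_quoted)[1], evaluated_args)
total
```

The function above does the same thing, except that the evaluation happens once per group inside j.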

As you can see, this makes for fairly convoluted code. I am not sure I would recommend this approach.
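
If the interface does not have to be split into separate vectors of names, functions and arguments, a simpler option may be to pass the whole j expression as a single quoted call. The following is only a sketch under that assumption, not what the question asked for verbatim; run_j is a hypothetical helper name, and the data is the sample from the question.

```r
library(data.table)

# Same sample data as in the question
DT <- data.table(
  peak.grp = c(1L, 2L, 2L, 2L, 2L),
  s        = c(248, 264, 282, 304, 333),
  height   = c(222772.8125, 370112.28125, 426524.03125, 649691.75, 698039)
)

# run_j() is a hypothetical helper: it evaluates a quoted j expression per group
run_j <- function(DT, j_expr, by_cols) {
  DT[, eval(j_expr), by = by_cols]
}

# The output names, functions and arguments all live in one quoted call
j_expr <- quote(list(Sum.s       = sum(s, na.rm = TRUE),
                     Mean.height = mean(height, na.rm = TRUE)))

res <- run_j(DT, j_expr, "peak.grp")
res
```

This keeps the output names, functions and arguments together in one expression, at the cost of requiring the caller to write a quoted call instead of plain vectors and lists.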