Passing a vector of column-name strings and their interactions for use with by, eval, get in data.table

Time: 2017-07-06 08:18:17

Tags: r data.table

df <- data.table(A=c(1:5), B=c(1,1,2,2,3), C=c(2,2,1,1,1), D=c(1,2,2,3,1))

col_inters <- c("A_B", "A", "B_C") 

How can I iterate over col_inters to get the equivalent of the following:

 df[, eval("Inter.A_B") := mean(D), by=list(A,B)]
 df[, eval("Inter.A") := mean(D), by=A]
 df[, eval("Inter.B_C") := min(D), by=list(B,C)] 

I need help completing this loop:

 for(i in 1:NROW(col_inters)) {
      df[, eval(paste("Inter.", col_inters[i], sep="")) := mean(D), by=???]
 }

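1 Answer:

Answer 0 (score: 0)

Assuming the data is correct, we split 'col_inters' by _ into a list, pair each element with its corresponding function and the name of the new column to create, and use Map: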

Map(function(cols, fn, new) df[, (new) := get(fn)(D), by = c(cols)],
           strsplit(col_inters, "_"), c("mean", "mean", "min"), 
            paste0("Inter." , col_inters))

df
#   A B C D Inter.A_B Inter.A Inter.B_C
#1: 1 1 2 1         1       1         1
#2: 2 1 2 2         2       2         1
#3: 3 2 1 2         2       2         2
#4: 4 2 1 3         3       3         2
#5: 5 3 1 1         1       1         1
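
The approach above relies on two pieces: strsplit turning each interaction string into a character vector of column names, and data.table accepting such a character vector in by. The following is a minimal sketch for inspecting those two pieces in isolation. The names grp_cols, res and min_D are only illustrative and do not appear in the original answer.

```r
library(data.table)

col_inters <- c("A_B", "A", "B_C")

# Each interaction string becomes a character vector of grouping columns:
# c("A", "B"), "A", c("B", "C")
grp_cols <- strsplit(col_inters, "_")
print(grp_cols)

df <- data.table(A = 1:5, B = c(1, 1, 2, 2, 3), C = c(2, 2, 1, 1, 1), D = c(1, 2, 2, 3, 1))

# `by` accepts a character vector of column names, so a split string
# can be passed directly; here we aggregate instead of assigning with :=
res <- df[, .(min_D = min(D)), by = c("B", "C")]
print(res)
```

Because by takes plain column names as strings, no eval or parsing of the interaction string is needed beyond the split.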

Or we can use a for loop:

lst <- strsplit(col_inters, "_")
fns <- c("mean", "mean", "min")
nm1 <- paste0("Inter." , col_inters)
for(i in seq_along(lst)) df[, (nm1[i]) := get(fns[i])(D), by = c(lst[[i]])][]
df
#   A B C D Inter.A_B Inter.A Inter.B_C
#1: 1 1 2 1         1       1         1
#2: 2 1 2 2         2       2         1
#3: 3 2 1 2         2       2         2
#4: 4 2 1 3         3       3         2
#5: 5 3 1 1         1       1         1
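
As a variation on the loop above (a sketch, not the answer's own method), the function for each interaction could be stored as a function object in a named list rather than looked up by name with get. The list inter_fns and the loop variables nm, cols, fn and new_col are hypothetical names, and the sketch assumes none of them collides with a column name in df.

```r
library(data.table)

df <- data.table(A = 1:5, B = c(1, 1, 2, 2, 3), C = c(2, 2, 1, 1, 1), D = c(1, 2, 2, 3, 1))

# Hypothetical spec: interaction string -> summary function to apply to D
inter_fns <- list(A_B = mean, A = mean, B_C = min)

for (nm in names(inter_fns)) {
  cols <- strsplit(nm, "_", fixed = TRUE)[[1]]  # grouping columns for this interaction
  fn <- inter_fns[[nm]]                          # function object, so no get() is needed
  new_col <- paste0("Inter.", nm)                # name of the column to create
  df[, (new_col) := fn(D), by = c(cols)]         # add the column by reference
}

print(df)
```

Keeping the interaction and its function in one named list avoids having to keep two parallel vectors in the same order.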

Data

df <- data.table(A=1:5, B= c(1,1,2,2,3), C=c(2,2,1,1,1), D=c(1,2,2,3,1))