I have the following data.table:
library(data.table)
DT = as.data.table(data.frame(Z=c("abc","abc","def","abc"), column=c(1,2,3,4), someOtherColumn=c(5,6,7,8)))
Fn = function(DT1) {
Value = as.numeric(DT1[1, 2])
Calc = sapply(DT1[, c("Z"):=NULL], sum) - Value
return(matrix(Calc, nr = 1, nc = length(Calc)))
}
Now I want to apply Fn() to each group defined by 'Z' and get a result matrix with 2 rows (because DT$Z has 2 unique values) and 2 columns:
DT[, Fn(.SD), by = Z, .SDcols = c('Z', 'column', 'someOtherColumn')]
However, this gives me the following error:
Error in `[.data.table`(DT1, , `:=`(c("Z"), NULL)) :
.SD is locked. Using := in .SD's j is reserved for possible future use; a tortuously flexible way to modify by group. Use := in j directly to modify by group by reference.
I can get the desired result with lapply():
do.call(rbind, lapply(split(DT, DT[['Z']]), Fn))
Any pointers to the correct way of achieving this would be helpful.
I have a very large DT, so I am looking for an efficient method.
Answer 0 (score: 1)
I tried to fix the code so that it runs. I am not a data.table expert, so I cannot explain in depth how it works. Maybe this is what you are after.
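I think Fn cannot return a bare matrix, because 'j' must be a list or an atomic vector, so the matrix is wrapped in a list here. I also replaced the := on .SD, which is what the "locked" error complains about, with .SDcols = -"Z".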
Fn = function(DT1) {
Value = as.numeric(DT1[1, 2])
Calc = DT1[, lapply(.SD, sum) , .SDcols = -"Z"] - Value
list(matrix(Calc, nrow = 1, ncol = length(Calc)))
}
out <- DT[, .(Fn(.SD)), by = Z, .SDcols = c("Z", "column", "someOtherColumn")]
> out
#      Z       V1
# 1: abc <matrix>
# 2: def <matrix>
> out$V1
# [[1]]
#      [,1] [,2]
# [1,]    6   18
#
# [[2]]
#      [,1] [,2]
# [1,]    0    4
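If the list column of matrices is not actually required, a plain per-group lapply(.SD, ...) may be simpler. The following is only a sketch under a few assumptions: it rebuilds DT with data.table() directly, the names cols, res and mat are mine and not from the question, and it assumes "Value" always means the first value of the first numeric column within each group. I have not benchmarked it on a large table.

```r
library(data.table)

# Same data as in the question, built directly with data.table()
DT <- data.table(Z = c("abc", "abc", "def", "abc"),
                 column = c(1, 2, 3, 4),
                 someOtherColumn = c(5, 6, 7, 8))

# Hypothetical helper name: the numeric columns to aggregate
cols <- c("column", "someOtherColumn")

res <- DT[, {
  # First value of `column` within the current group (the "Value" in Fn)
  value <- .SD[[1L]][1L]
  # Per-column group sum minus that value; returning a list gives one row per group
  lapply(.SD, function(x) sum(x) - value)
}, by = Z, .SDcols = cols]

# Convert to a plain numeric matrix if a matrix is really needed
mat <- as.matrix(res[, cols, with = FALSE])
```

Here res has one row per value of Z and one column per entry of cols, holding the same numbers as the matrices above (6 and 18 for abc, 0 and 4 for def). Nothing is modified by reference inside j, so the locked .SD is never touched.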