When I apply a function with multiple output variables (e.g., a list) to a subset of a data.table, I lose the variable names. Is there a way to retain them?
library(data.table)
foo <- function(x){
list(mn = mean(x), sd = sd(x))
}
bar <- data.table(x=1:8, y=c("d","e","f","g"))
# column names "mn" and "sd" are replaced by "V1" and "V2"
bar[, sapply(.SD, foo), by = y, .SDcols="x"]
# column names "mn" and "sd" are retained
bar_split <- split(bar$x, bar$y)
t(sapply(bar_split, foo))
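Answer 0 (score: 11)
I came up with the following. It is a bit awkward, but it does not require writing the names by hand, no matter how many functions or columns there are: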
set.seed(1)
bar[, z := sample(8)]
bar[, as.list(unlist(lapply(.SD, foo))), by = y, .SDcols = c("x", "z")]
#    y x.mn     x.sd z.mn      z.sd
# 1: d    3 2.828427  2.0 1.4142136
# 2: e    4 2.828427  7.5 0.7071068
# 3: f    5 2.828427  3.0 1.4142136
# 4: g    6 2.828427  5.5 0.7071068
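The biggest advantage of this approach is that it binds the function's output names to the column names (x.mn, x.sd, z.mn, z.sd). If, for example, you have an additional column, the same code still gives an informative result.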
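The naming comes from base R rather than from data.table: unlist() on a nested named list joins the outer and inner names with a dot. A minimal sketch outside data.table, where the list cols is a hypothetical stand-in for .SD within one group:

```r
# foo as defined in the question
foo <- function(x) list(mn = mean(x), sd = sd(x))

# Hypothetical stand-in for .SD in a single group: two named columns
cols <- list(x = c(1, 5), z = c(2, 4))

# lapply() gives list(x = list(mn, sd), z = list(mn, sd));
# unlist() flattens it and pastes the names together with "."
res <- unlist(lapply(cols, foo))
names(res)
# "x.mn" "x.sd" "z.mn" "z.sd"

# as.list() turns the named vector into a named list, which
# data.table reads as one column per element
str(as.list(res))
```

Because the names are built from whatever columns and list elements are present, nothing needs to be updated when .SDcols or foo changes.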
Answer 1 (score: 2)
The setNames function lets you supply the missing names as a character vector:
bar[, setNames( sapply(.SD, foo), c("mn", "sd")), by = y, .SDcols="x"]
#    y mn       sd
# 1: d  3 2.828427
# 2: e  4 2.828427
# 3: f  5 2.828427
# 4: g  6 2.828427
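When only one column is involved, a possibly simpler sketch is to call foo on the column directly in j. data.table then takes the column names from the named list that foo returns, so no names have to be typed. The data below repeat bar and foo from the question so that the block runs on its own:

```r
library(data.table)

# foo and bar as defined in the question
foo <- function(x) list(mn = mean(x), sd = sd(x))
bar <- data.table(x = 1:8, y = c("d", "e", "f", "g"))

# j evaluates to a named list, so the names "mn" and "sd" are kept
res <- bar[, foo(x), by = y]
names(res)
# "y" "mn" "sd"
```

This only covers the single-column case; for several columns in .SDcols the names still have to be built or supplied in some other way.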
The authors suggest using a different form, which Arenburg also suggested:
DT[, c('x2', 'y2') := list(x / sum(x), y / sum(y)), by = grp]
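The line above refers to a table DT and a grouping column grp that are not defined in this thread. A small runnable sketch with made-up data (the values of grp, x and y are assumptions chosen only for illustration):

```r
library(data.table)

# Hypothetical data: two groups with two rows each
DT <- data.table(grp = c("a", "a", "b", "b"),
                 x   = c(1, 3, 2, 6),
                 y   = c(2, 2, 1, 3))

# := adds the columns by reference; the left-hand side names them
# explicitly, the right-hand side is a list with one element per column
DT[, c("x2", "y2") := list(x / sum(x), y / sum(y)), by = grp]

DT$x2
# 0.25 0.75 0.25 0.75
DT$y2
# 0.50 0.50 0.25 0.75
```

Unlike the aggregating calls earlier in the thread, this form keeps every row of DT and stores the per-group results in new columns.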