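I want to derive summary statistics for subsets defined over several variables, where the subset vectors have different lengths.
I tried combining the on argument of data.table with CJ, and I got a working solution, but it takes considerably more typing than the version that uses keyed variables. Is there a better way to accomplish this task?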
library(data.table)
dt <- CJ(year = 2015:2016, class = 1:4, age = 15:20)
set.seed(1)
dt[, var := rnorm(48)]
dt[CJ(class = 2:3, age = 15:17),
list(med = median(var)), on = c("class", "age"),
keyby = .(age, year, class)]
##     age year class        med
##  1:  15 2015     2  0.4874291
##  2:  15 2015     3 -0.6212406
##  3:  15 2016     2  1.3586796
##  4:  15 2016     3 -0.3942900
##  5:  16 2015     2  0.7383247
##  6:  16 2015     3 -2.2146999
##  7:  16 2016     2 -0.1027877
##  8:  16 2016     3 -0.0593134
##  9:  17 2015     2  0.5757814
## 10:  17 2015     3  1.1249309
## 11:  17 2016     2  0.3876716
## 12:  17 2016     3  1.1000254
##### keyed version -- less typing work
setkey(dt, class, age)
dt[CJ(2:3, 15:17), list(med = median(var)),
keyby = .(age, year, class)]
##### same result as above