R data.table: scope of a user-defined function in j

Date: 2015-12-01 10:41:25

Tags: r data.table

I have a data.table like the one below. For each market, I want to compute the correlation between each signal and the return.

library(data.table)
dt = data.table(mkt = rep(letters[1:3], each = 3), rtn = rnorm(9), signal1=rnorm(9), signal2=rnorm(9), signal3 = rnorm(9))
   mkt      rtn    signal1     signal2    signal3
1:   a  0.2488643  0.4110516 -0.04861252 -1.3599824
2:   a  1.3387256 -0.4418436 -0.17055841 -1.2161698
3:   a -1.4058236 -1.2624645 -0.24315048 -1.2722546
4:   b  1.7056606  0.2618591  2.60779232  0.7786226
5:   b  0.7913587 -1.0596116  0.31152541  1.7336651
6:   b -1.8690651  0.1942825  0.95430075 -0.7030462
7:   c -0.4937575 -1.8645226 -0.32312077 -1.7138482
8:   c -0.7153342 -0.5142624 -0.43817789 -1.3637261
9:   c  0.3766730 -0.0954339  0.71159756 -1.2118075

dt[, lapply(.SD, function(x) cor(x, rtn, use = 'c')), .SDcols = 3:5, by = mkt]
Error in is.data.frame(y) : object 'rtn' not found

How can I make the anonymous function in j aware of the rtn column?

1 Answer:

Answer 0: (score: 2)

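I think one way is to include rtn in .SDcols so that the anonymous function can find it, and then drop the rtn column from the result (it will only ever contain 1, since it is the correlation of rtn with itself):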

dt[, lapply(.SD, function(x) cor(x, rtn, use = 'c')), .SDcols = c(2, 3:5), by = mkt]

   mkt rtn    signal1    signal2    signal3
1:   a   1  0.6759421 -0.5037837  0.8605805
2:   b   1 -0.8494135  0.6720274  0.7832928
3:   c   1 -0.9425291  0.5683629 -0.9976231

Then you can do this:

dt2 <- dt[, lapply(.SD, function(x) cor(x, rtn, use = 'c')), .SDcols = c(2, 3:5), by = mkt]
dt2[, rtn := NULL]
dt2
#   mkt    signal1    signal2    signal3
#1:   a  0.6759421 -0.5037837  0.8605805
#2:   b -0.8494135  0.6720274  0.7832928
#3:   c -0.9425291  0.5683629 -0.9976231
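
A possible variation on the same idea, as a minimal sketch rather than a definitive solution: keep rtn in .SDcols so the anonymous function can see it, but subset the list returned by lapply() to the signal columns inside j, so the constant rtn column never reaches the result. The helper name sig_cols and the set.seed() call are assumptions added for illustration, so the numbers will differ from the output shown above.

```r
library(data.table)

set.seed(1)  # assumption: fixed seed so the example is reproducible
dt <- data.table(mkt = rep(letters[1:3], each = 3),
                 rtn = rnorm(9),
                 signal1 = rnorm(9),
                 signal2 = rnorm(9),
                 signal3 = rnorm(9))

# Hypothetical helper: names of the columns to correlate with rtn
sig_cols <- c("signal1", "signal2", "signal3")

# rtn is listed in .SDcols so it is visible inside j;
# [sig_cols] keeps only the signal entries of the list built by lapply()
res <- dt[, lapply(.SD, function(x) cor(x, rtn, use = "complete.obs"))[sig_cols],
          by = mkt, .SDcols = c("rtn", sig_cols)]
res
```

This gives the same table as dt2 after removing rtn, without the separate dt2[, rtn := NULL] step. Writing use = "complete.obs" in full avoids relying on partial matching of 'c'.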