每个data.table行与另一个data.table行的关联

时间:2019-01-09 15:49:16

标签: r data.table

我有如下数据:

set.seed(1)
portfolio_profit_daily <- data.table(`Stat`="Profit","2019-01-01"=rnorm(1), "2019-01-02"=rnorm(1), "2019-01-03"=rnorm(1), "2019-01-04"=rnorm(1), "2019-01-05"=rnorm(1))
asset_forecasted_daily <- data.table(`Asset`=c("A1","A2","A3","A4"),"2019-01-01"=rnorm(4), "2019-01-02"=rnorm(4), "2019-01-03"=rnorm(4), "2019-01-04"=rnorm(4), "2019-01-05"=rnorm(4),
                                 "2019-01-06"=rnorm(4),"2019-01-07"=rnorm(4),"2019-01-08"=rnorm(4),"2019-01-09"=rnorm(4),"2019-01-10"=rnorm(4))

data.table portfolio_profit_daily包含整个投资组合的每日收益(一个月的5天过去了)。 asset_forecasted_daily包含几只股票的预测收益(请注意预测期为10天)。我想在asset_forecasted_daily中添加新列,以计算每只股票预测的前5天与每日投资组合收益的5天之间的相关性。

非常感谢。

1 个答案:

答案 0 :(得分:1)

像这样吗?

colvar <- names(portfolio_profit_daily)[-1L]

#extract portfolio returns
y <- unlist(portfolio_profit_daily[,..colvar])

#calculate correlation for each row using returns for the same portfolio dates
asset_forecasted_daily[, myCorr := cor(unlist(.SD), y), 
    by=1:nrow(asset_forecasted_daily), .SDcols=colvar]

输出:

   Asset 2019-01-01 2019-01-02  2019-01-03 2019-01-04  2019-01-05  2019-01-06 2019-01-07  2019-01-08 2019-01-09 2019-01-10       myCorr
1:    A1 -0.8204684 -0.3053884 -2.21469989  0.9438362  0.78213630 -0.05612874  0.4179416 -0.05380504 -0.0593134 -0.2533617  0.863474950
2:    A2  0.4874291  1.5117812  1.12493092  0.8212212  0.07456498 -0.15579551  1.3586796 -1.37705956  1.1000254  0.6969634 -0.071469479
3:    A3  0.7383247  0.3898432 -0.04493361  0.5939013 -1.98935170 -1.47075238 -0.1027877 -0.41499456  0.7631757  0.5566632 -0.004912902
4:    A4  0.5757814 -0.6212406 -0.01619026  0.9189774  0.61982575 -0.47815006  0.3876716 -0.39428995 -0.1645236 -0.6887557  0.431029254