Question

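I have a data.table in which 40 columns represent income in 40 consecutive periods. I am trying to add, for each observation, a variable holding the NPV of the income stream (i.e. $\sum_{t=1}^{T} \beta^{t-1} y_{i,t}$, the discounted sum of income).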

My approach was:

dt[,NPV:=rowSums(.SD*.95^(0:39)),.SDcols=paste0("year_",1:40)]

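But this gives strange results. In fact, .SD*.95^(0:39) on its own is doing something I don't understand; I suppose the problem is that it doesn't know how to multiply .SD by the vector .95^(0:39). Some recycling must be going on...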

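Given this, I tried some sort of lapply to handle the product, but that didn't work either; next, stating the problem as a matrix multiplication, .SD %*% .95^(0:39), also failed.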

Any ideas on what to do? Perhaps reshape and go from there...

To be concrete, here is an example with 5 periods that you can play with.

library(data.table)
set.seed(3654654)
dt <- data.table(id = 1:10,
                 year_1 = rchisq(10, df = 1),
                 year_2 = rchisq(10, df = 1),
                 year_3 = rchisq(10, df = 1),
                 year_4 = rchisq(10, df = 1),
                 year_5 = rchisq(10, df = 1))

> dt
    id     year_1     year_2       year_3      year_4      year_5
 1:  1 0.27161866 0.12764396 0.2775017833 5.210941183 0.027654609
 2:  2 2.44271387 1.21104397 0.1242118874 0.009518939 3.265443502
 3:  3 0.18095011 0.06581832 1.1619364400 0.938078133 2.238590035
 4:  4 0.02148331 3.38477084 0.1254167045 0.041640559 0.212538797
 5:  5 1.27821958 0.19046799 3.1166384038 0.586280661 0.019470595
 6:  6 0.03413820 0.68214806 0.9325970029 0.568719470 0.061664982
 7:  7 2.32055628 0.04137301 0.1810722845 0.050654213 1.377958712
 8:  8 0.95498438 0.03095528 0.7081911061 3.127335761 2.293907090
 9:  9 4.49044959 1.75553222 0.0005865227 0.207076713 0.577015216
10: 10 0.02984232 0.02522646 0.3891819870 0.178056404 0.006526457

So the NPVs should be:

           [,1]
 [1,] 5.1335813
 [2,] 6.3731923
 [3,] 3.9197555
 [4,] 3.5590199
 [5,] 4.7904516
 [6,] 2.0616800
 [7,] 3.6890640
 [8,] 6.1732355
 [9,] 6.8062594
[10,] 0.5630211

Here is what my attempts so far give me:

> dt[,rowSums(.SD*.95^(0:4)),.SDcols=paste0("year_",1:5)]
 [1] 5.9153602 6.7002856 4.1382992 3.2458933 4.2281649
     2.2792677 3.7730338 6.4216247 6.0279123 0.5121889

(Completely wrong. Why? For the same reason that this doesn't work:

> dt[,.SD*.95^(0:4),.SDcols=paste0("year_",1:5)]
        year_1     year_2       year_3      year_4     year_5
 1: 0.27161866 0.12764396 0.2775017833 5.210941183 0.02765461
 2: 2.32057818 1.15049177 0.1180012931 0.009042992 3.10217133
 3: 0.16330748 0.05940104 1.0486476371 0.846615515 2.02032751
 4: 0.01841926 2.90201790 0.1075291471 0.035701574 0.18222545
 5: 1.04111784 0.15513737 2.5385214589 0.477529263 0.01585892
 6: 0.03413820 0.68214806 0.9325970029 0.568719470 0.06166498
 7: 2.20452847 0.03930436 0.1720186702 0.048121502 1.30906078
 8: 0.86187340 0.02793714 0.6391424733 2.822420524 2.07025115
 9: 3.84999922 1.50514943 0.0005028699 0.177542396 0.49471842
10: 0.02430675 0.02054711 0.3169911608 0.145028054 0.00531584

- it seems to be multiplying down the rows rather than across the columns)

> dt[,.SD %*% .95^(0:4),.SDcols=paste0("year_",1:5)]
Error in .SD %*% 0.95^(0:4) : 
  requires numeric/complex matrix/vector arguments

Answer 1

Try this:

> dt[, as.matrix(.SD) %*% 0.95 ^ (0:4), .SDcols = -1]
           [,1]
 [1,] 5.1335813
 [2,] 6.3731923
 [3,] 3.9197555
 [4,] 3.5590199
 [5,] 4.7904516
 [6,] 2.0616800
 [7,] 3.6890640
 [8,] 6.1732355
 [9,] 6.8062594
[10,] 0.5630211

Or:

as.matrix(dt[, -1, with = FALSE]) %*% 0.95 ^ (0:4)

Update: minor improvement based on the comments.
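For readers wondering why the original attempt failed: as far as I understand base R's data-frame arithmetic, a vector operand is recycled in column-major order, so within each column it runs down the rows. The weights therefore end up attached to rows, not to columns. The sketch below contrasts that with the matrix product on a toy table; the table `toy`, the weight vector `w`, and the discount factor 0.5 are made up for illustration and are not part of the question's data.

```r
library(data.table)

# Hypothetical toy data: two observations, two periods, all incomes equal to 1
toy  <- data.table(id = 1:2, year_1 = c(1, 1), year_2 = c(1, 1))
cols <- c("year_1", "year_2")
w    <- c(1, 0.5)  # assumed weights beta^(0:1) with beta = 0.5

# Element-wise product: w is recycled down the rows of every column,
# so row 1 is scaled by 1 and row 2 by 0.5, regardless of the column
by_row <- toy[, .SD * w, .SDcols = cols]

# Matrix product: column j is scaled by w[j], then each row is summed
npv <- toy[, as.matrix(.SD) %*% w, .SDcols = cols]

# To store the result as a column, drop the matrix dimensions first
toy[, NPV := as.vector(as.matrix(.SD) %*% w), .SDcols = cols]
```

Here both rows should get an NPV of 1 + 0.5 = 1.5, whereas summing the rows of `by_row` would give 2 and 1.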

Answer 2

Here is a way that makes use of data.table:

vs   <- paste0("year_",1:5)
exps <- 1:5 - 1

dt[,NPV:=Reduce(
  `+`,
  mapply(
    function(x,y) x*.95^y,
    .SD,
    exps,
    SIMPLIFY=FALSE)
),.SDcols=vs]

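mapply applies the two-argument function to pairs of elements taken from .SD and exps; Reduce then collapses the resulting list with +. Of course, you could write all of this on one line.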
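To see the two steps separately, here is a small base-R sketch in which a plain list stands in for .SD. The list `cols`, its values, and the discount factor `beta = 0.5` are invented for illustration.

```r
# Hypothetical stand-in for .SD: a list of equal-length columns
cols <- list(year_1 = c(1, 2), year_2 = c(1, 0), year_3 = c(1, 4))
exps <- 0:2   # exponent for each column
beta <- 0.5   # assumed discount factor

# Step 1: pair column k with exponent k and discount it;
# SIMPLIFY = FALSE keeps the result as a list of columns
discounted <- mapply(function(x, y) x * beta^y, cols, exps, SIMPLIFY = FALSE)

# Step 2: add the discounted columns together element-wise
npv <- Reduce(`+`, discounted)
```

With these numbers `npv` should be 1.75 and 3. `Map(function(x, y) x * beta^y, cols, exps)` is an equivalent spelling of step 1, since `Map` is a wrapper around `mapply` with `SIMPLIFY = FALSE`.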

Answer 3

# Using a data.frame: df is your data frame. This assumes year 1 falls at the
# beginning of the horizon, so the discount exponent is 0 for the first year
# (a factor of 1) and the factor is 0.95 for the second year. In the data
# frame, year_1 is in column 2 and year_5 is the last column.

df <- data.frame(dt)
NPV <- rowSums(sapply(2:ncol(df), function(i) df[, i] * 0.95^(i - 2)))
> NPV
 [1] 5.1335813 6.3731923 3.9197555 3.5590199 4.7904516 2.0616800 3.6890640 6.1732355 6.8062594 0.5630211
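The question also floats the idea of reshaping. None of the answers takes that route, so the following is only a sketch of how it might look with data.table's `melt`; the toy table, the column names `period`, `income`, and `step`, and the discount factor 0.5 are assumptions made for illustration.

```r
library(data.table)

# Hypothetical toy data: two observations, three periods
toy  <- data.table(id = 1:2,
                   year_1 = c(1, 2),
                   year_2 = c(1, 0),
                   year_3 = c(1, 4))
beta <- 0.5  # assumed discount factor

# Wide to long: one row per (id, period)
long <- melt(toy, id.vars = "id",
             variable.name = "period", value.name = "income")

# Recover the period number from the column name "year_<n>"
long[, step := as.integer(sub("year_", "", as.character(period), fixed = TRUE))]

# Discount each income by beta^(step - 1) and sum within id
npv <- long[, .(NPV = sum(income * beta^(step - 1))), keyby = id]
```

For this toy input the two NPVs should be 1.75 and 3. The long format costs an extra reshape, so for a plain weighted row sum the matrix product is likely the simpler choice.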

Computing the net present value of a range of columns in an R data.table
