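I have a data.table with 40 columns holding income for 40 consecutive periods. For each observation, I am trying to add a variable equal to the NPV of the income stream (i.e. $\sum_{t=1}^{T} \beta^{t-1} y_{i,t}$, the discounted sum of income).
My approach was: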
dt[,NPV:=rowSums(.SD*.95^(0:39)),.SDcols=paste0("year_",1:40)]
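But this produces strange results. In fact, .SD*.95^(0:39) on its own is doing something I don't understand. I suspect the problem is that it doesn't know how to multiply .SD by the vector .95^(0:39), so it must be recycling...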
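Given this, I tried some kind of lapply to handle the product, but that didn't work; next, I framed the problem as the matrix multiplication .SD %*% .95^(0:39), which didn't work either.
Any ideas on what to do? Maybe reshape and go from there...
To be concrete, here is an example with 5 periods that you can play with.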
library(data.table)
set.seed(3654654)
dt <- data.table(id = 1:10, year_1 = rchisq(10, df = 1),
                 year_2 = rchisq(10, df = 1),
                 year_3 = rchisq(10, df = 1),
                 year_4 = rchisq(10, df = 1),
                 year_5 = rchisq(10, df = 1))
> dt
id year_1 year_2 year_3 year_4 year_5
1: 1 0.27161866 0.12764396 0.2775017833 5.210941183 0.027654609
2: 2 2.44271387 1.21104397 0.1242118874 0.009518939 3.265443502
3: 3 0.18095011 0.06581832 1.1619364400 0.938078133 2.238590035
4: 4 0.02148331 3.38477084 0.1254167045 0.041640559 0.212538797
5: 5 1.27821958 0.19046799 3.1166384038 0.586280661 0.019470595
6: 6 0.03413820 0.68214806 0.9325970029 0.568719470 0.061664982
7: 7 2.32055628 0.04137301 0.1810722845 0.050654213 1.377958712
8: 8 0.95498438 0.03095528 0.7081911061 3.127335761 2.293907090
9: 9 4.49044959 1.75553222 0.0005865227 0.207076713 0.577015216
10: 10 0.02984232 0.02522646 0.3891819870 0.178056404 0.006526457
So the NPVs should be:
[,1]
[1,] 5.1335813
[2,] 6.3731923
[3,] 3.9197555
[4,] 3.5590199
[5,] 4.7904516
[6,] 2.0616800
[7,] 3.6890640
[8,] 6.1732355
[9,] 6.8062594
[10,] 0.5630211
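As a sanity check, the first expected value can be reproduced by hand from the first row of the table. This is only a minimal sketch: it uses the rounded values as printed above, and the names y1 and disc are introduced here purely for illustration.

```r
# First-row incomes, copied from the printed table (rounded for display)
y1 <- c(0.27161866, 0.12764396, 0.2775017833, 5.210941183, 0.027654609)
disc <- 0.95  # discount factor (beta in the formula)

# Sum of beta^(t-1) * y_t for t = 1..5
npv1 <- sum(disc^(seq_along(y1) - 1) * y1)
print(npv1)
```

The result should agree with the first entry of the expected output up to display rounding.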
Here is what my attempts so far give me:
> dt[,rowSums(.SD*.95^(0:4)),.SDcols=paste0("year_",1:5)]
[1] 5.9153602 6.7002856 4.1382992 3.2458933 4.2281649
2.2792677 3.7730338 6.4216247 6.0279123 0.5121889
(Completely wrong. Why? Presumably for the same reason that this doesn't work:
> dt[,.SD*.95^(0:4),.SDcols=paste0("year_",1:5)]
year_1 year_2 year_3 year_4 year_5
1: 0.27161866 0.12764396 0.2775017833 5.210941183 0.02765461
2: 2.32057818 1.15049177 0.1180012931 0.009042992 3.10217133
3: 0.16330748 0.05940104 1.0486476371 0.846615515 2.02032751
4: 0.01841926 2.90201790 0.1075291471 0.035701574 0.18222545
5: 1.04111784 0.15513737 2.5385214589 0.477529263 0.01585892
6: 0.03413820 0.68214806 0.9325970029 0.568719470 0.06166498
7: 2.20452847 0.03930436 0.1720186702 0.048121502 1.30906078
8: 0.86187340 0.02793714 0.6391424733 2.822420524 2.07025115
9: 3.84999922 1.50514943 0.0005028699 0.177542396 0.49471842
10: 0.02430675 0.02054711 0.3169911608 0.145028054 0.00531584
It seems to be multiplying down the rows rather than across the columns.)
> dt[,.SD %*% .95^(0:4),.SDcols=paste0("year_",1:5)]
Error in .SD %*% 0.95^(0:4) :
requires numeric/complex matrix/vector arguments
Answer 0 (score: 4)
Try this:
> dt[, as.matrix(.SD) %*% 0.95 ^ (0:4), .SDcols = -1]
[,1]
[1,] 5.1335813
[2,] 6.3731923
[3,] 3.9197555
[4,] 3.5590199
[5,] 4.7904516
[6,] 2.0616800
[7,] 3.6890640
[8,] 6.1732355
[9,] 6.8062594
[10,] 0.5630211
Or:
as.matrix(dt[, -1, with = FALSE]) %*% 0.95 ^ (0:4)
Update: minor improvement based on the comments.
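To illustrate why the attempts in the question go wrong and why as.matrix helps, here is a small sketch on a hypothetical table (toy and w are made-up names, not from the question). Arithmetic between a data.table and a plain vector works column by column and recycles the vector down the rows, so the weights end up tied to row positions. %*% in turn rejects .SD because it is a list of columns rather than a numeric matrix, which is what the error message is saying.

```r
library(data.table)

# Hypothetical table of ones, so the applied multipliers are easy to read off
toy <- data.table(a = rep(1, 4), b = rep(1, 4))
w <- c(1, 0.5)  # intended as one weight per column

# The vector is recycled down the rows of each column:
# both columns come out as 1, 0.5, 1, 0.5
recycled <- toy[, .SD * w]
print(recycled)

# After conversion to a numeric matrix, %*% pairs each weight with its column:
# every row gives 1 * 1 + 1 * 0.5
weighted <- toy[, as.matrix(.SD) %*% w]
print(weighted)
```

This matches the pattern in the question's output, where rows 1 and 6 are unchanged in every column: with 10 rows and 5 weights, the weights simply cycle twice down each column.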
Answer 1 (score: 3)
Here is a way that takes advantage of data.table:
vs <- paste0("year_",1:5)
exps <- 1:5 - 1
dt[,NPV:=Reduce(
`+`,
mapply(
function(x,y) x*.95^y,
.SD,
exps,
SIMPLIFY=FALSE)
),.SDcols=vs]
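mapply applies the two-argument function to pairs of elements taken from .SD and exps, and Reduce collapses the results with +. Of course, you can write it all on one line.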
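One possible single-line form is sketched below, using Map (a thin wrapper around mapply that never simplifies its result) on a small hypothetical table. The names toy and disc and the three-period layout are assumptions for illustration only; the final line cross-checks the result against the matrix-product approach.

```r
library(data.table)

set.seed(1)
# Hypothetical 3-period table, standing in for the real data
toy <- data.table(id = 1:3, year_1 = runif(3), year_2 = runif(3), year_3 = runif(3))
vs <- paste0("year_", 1:3)
disc <- 0.95  # discount factor

# Map pairs each year column with its exponent; Reduce adds the discounted columns
toy[, NPV := Reduce(`+`, Map(function(x, k) x * disc^k, .SD, seq_along(vs) - 1)), .SDcols = vs]

# Cross-check against the matrix product over the same columns
check <- as.vector(as.matrix(toy[, vs, with = FALSE]) %*% disc^(seq_along(vs) - 1))
print(all.equal(toy$NPV, check))
```

Unlike the matrix version, this avoids converting the selected columns to a matrix, which may matter for wide or very long tables.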
Answer 2 (score: 0)
#Using data.frame: df is your data frame. Assuming year 1 is the start of the
#stream, the discount exponent is 0 for the first year (factor 1) and the
#factor is 0.95 for the second year. In the data frame, year 1 starts in
#column 2 and year 5 is the last column
df<-data.frame(dt)
NPV<-rowSums(sapply(2:ncol(df),function(i){df[,i]*0.95^(i-2)}))
> NPV
[1] 5.1335813 6.3731923 3.9197555 3.5590199 4.7904516 2.0616800 3.6890640 6.1732355 6.8062594 0.5630211