Calculating cumprod when the data contains several different stocks

Asked: 2014-05-14 21:54:05

Tags: r

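I am trying to create a column of cumulative returns in a data frame. The data frame holds several stocks, identified by the column "n". I want to compute the cumulative return for each stock without splitting the data frame into one piece per stock. Please see this example: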

library("lubridate")
n=c("IBM","IBM","IBM","IBM","IBM","IBM","IBM","IBM","IBM","IBM",
   "AAPL","AAPL","AAPL","AAPL","AAPL","AAPL","AAPL","AAPL","AAPL","AAPL",
   "GOOG","GOOG","GOOG","GOOG","GOOG","GOOG","GOOG","GOOG","GOOG","GOOG"
   )

dt=c("20140407","20140408","20140409","20140410","20140411",
     "20140414","20140415","20140416","20140417","20140418",
     "20140407","20140408","20140409","20140410","20140411",
     "20140414","20140415","20140416","20140417","20140418",
     "20140407","20140408","20140409","20140410","20140411",
     "20140414","20140415","20140416","20140417","20140418"
     )
ret=c(0.006,0.049,0.069,0.068,0.062,0.035,0.001,0.048,0.034,0.025,
     0.068,0.002,0.042,0.036,0.01,0.006,0.074,0.021,0.005,0.028,
     0.082,0.041,0.036,0.083,0.012,0.031,0.061,0.032,0.061,0.041
     )

df=data.frame(n,dt,ret)
df$dt<-ymd(as.character(df$dt))
df$cum_ret <-cumprod(1+df$ret)

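As you can see, this data set has 3 stocks. The line above compounds straight through all 30 rows, so the AAPL and GOOG values carry over the returns of the stocks before them. I would like to change df$cum_ret so that it is the cumulative return for each stock (identified by the column "n").

Does anyone have any ideas? Thank you very much!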

2 answers:

Answer 0: (score: 1)

> s <- split(df, df$n)
> ll <- lapply(s, function(x) cbind(x, cum_ret = cumprod(1+x$ret)))
> unsplit(ll, df$n)
#       n       dt   ret  cum_ret
# 1   IBM 20140407 0.006 1.006000
# 2   IBM 20140408 0.049 1.055294
# 3   IBM 20140409 0.069 1.128109
# ...
# 11 AAPL 20140407 0.068 1.068000
# 12 AAPL 20140408 0.002 1.070136
# 13 AAPL 20140409 0.042 1.115082
# ...
# 21 GOOG 20140407 0.082 1.082000
# 22 GOOG 20140408 0.041 1.126362
# 23 GOOG 20140409 0.036 1.166911
# ...
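
As a possible alternative that avoids the explicit split and unsplit, base R's ave() applies a function within groups and returns the result in the original row order. The sketch below uses a small made-up data frame (the tickers "A" and "B" and their returns are hypothetical, not the question's data) and assumes the rows are already in date order within each ticker.

```r
# Hypothetical toy data: two tickers, rows already sorted by date within each
toy <- data.frame(
  n   = c("A", "A", "A", "B", "B"),
  ret = c(0.10, 0.20, 0.00, 0.05, 0.10)
)

# ave() runs FUN separately on the ret values of each level of n
# and puts the results back in the original row positions
toy$cum_ret <- ave(toy$ret, toy$n, FUN = function(x) cumprod(1 + x))

# A: 1.10, 1.32, 1.32   B: 1.05, 1.155
print(toy)
```

Applied to the question's data, the same pattern would be df$cum_ret <- ave(df$ret, df$n, FUN = function(x) cumprod(1 + x)).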

Answer 1: (score: 0)

You can do this with the dplyr package.

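library(dplyr)     # load dplyr (install it first with install.packages("dplyr"))

# group the data by "n" and compute the cumulative return within each group
# (%>% replaces the older %.% operator, which is no longer part of dplyr;
#  cumprod(1 + ret) compounds the returns as in the question, where cumsum(ret) would only add them)

df <- df %>% group_by(n) %>% mutate(cum_ret = cumprod(1 + ret))

Edit: to make sure each group is sorted by date before the cumulative return is computed, you can include arrange in the pipeline:

df <- df %>% arrange(n, dt) %>% group_by(n) %>% mutate(cum_ret = cumprod(1 + ret))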
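
To see why the sort matters, here is a minimal self-contained sketch on made-up data (the tickers "A" and "B", the dates, and the returns are all hypothetical). It assumes a current dplyr, where the pipe is %>%, and uses the compounding formula cumprod(1 + ret) from the question. The rows for ticker "A" are deliberately out of date order; without arrange() the intermediate cumulative values would follow row order rather than calendar order.

```r
library(dplyr)

# Hypothetical toy data; ticker "A" is intentionally not in date order
toy <- data.frame(
  n   = c("A", "A", "B", "B"),
  dt  = as.Date(c("2014-04-08", "2014-04-07", "2014-04-07", "2014-04-08")),
  ret = c(0.10, 0.20, 0.05, 0.00)
)

sorted <- toy %>%
  arrange(n, dt) %>%                     # sort by ticker, then by date
  group_by(n) %>%                        # restart the calculation for each ticker
  mutate(cum_ret = cumprod(1 + ret)) %>% # compound returns within the group
  ungroup()                              # drop grouping so later steps are not grouped

# A: 1.20, 1.32   B: 1.05, 1.05
print(sorted)
```

The final value per ticker is the same with or without the sort, since multiplication does not depend on order; only the running values along the way change.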