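I am trying to create a column with cumulative returns in a data frame. The data frame contains several stocks, identified by the column "n". I would like to compute the cumulative return for each stock without having to split the data frame into one piece per stock. Here is an example: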
library("lubridate")
n=c("IBM","IBM","IBM","IBM","IBM","IBM","IBM","IBM","IBM","IBM",
"AAPL","AAPL","AAPL","AAPL","AAPL","AAPL","AAPL","AAPL","AAPL","AAPL",
"GOOG","GOOG","GOOG","GOOG","GOOG","GOOG","GOOG","GOOG","GOOG","GOOG"
)
dt=c("20140407","20140408","20140409","20140410","20140411",
"20140414","20140415","20140416","20140417","20140418",
"20140407","20140408","20140409","20140410","20140411",
"20140414","20140415","20140416","20140417","20140418",
"20140407","20140408","20140409","20140410","20140411",
"20140414","20140415","20140416","20140417","20140418"
)
ret=c(0.006,0.049,0.069,0.068,0.062,0.035,0.001,0.048,0.034,0.025,
0.068,0.002,0.042,0.036,0.01,0.006,0.074,0.021,0.005,0.028,
0.082,0.041,0.036,0.083,0.012,0.031,0.061,0.032,0.061,0.041
)
df=data.frame(n,dt,ret)
df$dt<-ymd(as.character(df$dt))
df$cum_ret <-cumprod(1+df$ret)
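As you can see, this data set has 3 stocks. I want to change df$cum_ret so that it holds the cumulative return for each stock (identified by the column "n").
Does anyone have any ideas? Thank you very much!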
Answer 0 (score: 1)
> s <- split(df, df$n)
> ll <- lapply(s, function(x) transform(x, cum_ret = cumprod(1 + ret)))  # overwrites any existing cum_ret instead of adding a duplicate column
> unsplit(ll, df$n)
# n dt ret cum_ret
# 1 IBM 20140407 0.006 1.006000
# 2 IBM 20140408 0.049 1.055294
# 3 IBM 20140409 0.069 1.128109
# ...
# 11 AAPL 20140407 0.068 1.068000
# 12 AAPL 20140408 0.002 1.070136
# 13 AAPL 20140409 0.042 1.115082
# ...
# 21 GOOG 20140407 0.082 1.082000
# 22 GOOG 20140408 0.041 1.126362
# 23 GOOG 20140409 0.036 1.166911
# ...
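The same split-apply-combine step can also be written in a single base R call. This is a minimal sketch, not part of the original answer: it swaps split/lapply/unsplit for ave(), and it uses only the first three returns of IBM and AAPL from the question to keep the example short.

```r
# Subset of the question's data: first three returns for two tickers
df <- data.frame(
  n   = rep(c("IBM", "AAPL"), each = 3),
  ret = c(0.006, 0.049, 0.069, 0.068, 0.002, 0.042)
)

# ave() applies FUN within each level of the grouping variable and
# returns the results in the original row order, so no unsplit() is needed
df$cum_ret <- ave(df$ret, df$n, FUN = function(r) cumprod(1 + r))

print(df)
```

Because ave() keeps the original row order, the rows should be sorted by date within each stock before it is called.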
Answer 1 (score: 0)
You can do this with the dplyr package.
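library(dplyr)  # install the package first if needed, then load it
# group the data by "n" and compound the returns within each group
# (%>% replaces the old %.% operator, which current dplyr no longer provides;
#  cumprod(1 + ret) matches the question's definition of cumulative return)
df <- df %>% group_by(n) %>% mutate(cum_ret = cumprod(1 + ret))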
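Edit: to make sure the rows within each n group are in date order when the cumulative return is computed, you can include arrange in the pipeline: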
df <- df %>% arrange(n, dt) %>% group_by(n) %>% mutate(cum_ret = cumprod(1 + ret))
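Summing returns and compounding them give different numbers, which matters when choosing between cumsum and cumprod here. A minimal base R sketch using the first three IBM returns from the question (the variable names summed and compounded are illustrative only):

```r
ret <- c(0.006, 0.049, 0.069)  # first three IBM returns from the question

summed     <- cumsum(ret)            # additive running total of returns
compounded <- cumprod(1 + ret) - 1   # compounded return, net of the starting value 1

print(data.frame(ret, summed, compounded))
```

The two columns agree on the first row and drift apart afterwards, since the running sum ignores the return earned on earlier gains. The question's cum_ret is the growth factor cumprod(1 + ret), that is, compounded + 1.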