Add columns to a data frame to calculate log returns

Time: 2016-06-15 23:24:36

Tags: r

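I have a data frame containing financial time-series data, and I want to calculate the log returns for each series.

Here is a simplified example (in reality, I have many columns):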

df <- data.frame(
  Date = c("2004/10/29", "2004/11/30", "2004/12/31", "2005/01/31"),
  B126 = c("103.238", "104.821", "105.141", "107.682"),
  H251 = c("131.149", "138.989", "137.266", "137.080")
)
df

        Date    B126    H251
1 2004/10/29 103.238 131.149
2 2004/11/30 104.821 138.989
3 2004/12/31 105.141 137.266
4 2005/01/31 107.682 137.080

I would like to get the following:

        Date    B126     Log      H251     Log
1 2004/10/29 103.238           131.149 
2 2004/11/30 104.821  0.0152   138.989  0.0580
3 2004/12/31 105.141  0.0030   137.266 -0.0124
4 2005/01/31 107.682  0.0238   137.080 -0.0013

I know how to get the log returns for a single column using the following:

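# the example columns are stored as strings, so convert them to numbers first
logB126 <- as.numeric(as.character(df$B126))
log_returns <- diff(log(logB126), lag = 1)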

I can't repeat the above steps a hundred times, so I would like to know whether there is a better way to do this.

4 Answers:

Answer 0: (score: 3)

You can use plyr::colwise:

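# as.numeric(as.character(x)) handles the string columns in the example data
calc_log_return <- function(x) diff(log(as.numeric(as.character(x))), lag = 1)
# apply the function to every column except the first (Date)
logReturns <- plyr::colwise(calc_log_return)(df[, -1])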

This generates a new data.frame containing only the log returns. You can easily attach the date column afterwards.
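Attaching the dates needs one extra step, because diff() returns one row fewer than the input. The sketch below shows one way to do it. It uses base R's lapply as a stand-in for plyr::colwise so that it runs without extra packages, and the names log_returns and result are only illustrative.

```r
# Example data from the question (values stored as strings, as in the original)
df <- data.frame(
  Date = c("2004/10/29", "2004/11/30", "2004/12/31", "2005/01/31"),
  B126 = c("103.238", "104.821", "105.141", "107.682"),
  H251 = c("131.149", "138.989", "137.266", "137.080")
)

# Log return of one column; the conversion handles string or factor input
calc_log_return <- function(x) diff(log(as.numeric(as.character(x))), lag = 1)

# Base-R stand-in for plyr::colwise: apply the function to every non-date column
log_returns <- as.data.frame(lapply(df[, -1], calc_log_return))

# diff() drops the first observation, so drop the first date to keep rows aligned
result <- data.frame(Date = df$Date[-1], log_returns)
print(result)
```

If the first date should stay in the output, pad each return series with a leading NA instead, for example c(NA, diff(log(x))).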

Answer 1: (score: 1)

A simple for loop should do the job:

df2 <- df[,1:2]
for(name in names(df)[2:length(names(df))]){
    df2[,name] <- df[,name]
    df2[2:nrow(df2),paste0(name, ".Log")] <- diff(log(as.numeric(as.character(df[,name]))), lag = 1)
}
head(df2)

Answer 2: (score: 0)

We can use mutate_each from dplyr:

library(dplyr)
df %>% mutate_each(funs(round(c(NA, diff(log(as.numeric(as.character(.))))),3)),
                                     B126:H251)
#        Date  B126   H251
#1 2004/10/29    NA     NA
#2 2004/11/30 0.015  0.058
#3 2004/12/31 0.003 -0.012
#4 2005/01/31 0.024 -0.001

Answer 3: (score: 0)

Another dplyr solution. After applying mutate_each, use merge from base R to add the new columns to the original data.

library(dplyr)

# clean up data (convert strings to numbers)
df <- df %>% mutate_each(funs(as.numeric(as.character(.))), B126:H251)

# calculate log diff and merge (assign the result so the *_log columns exist for the next step)
df <- df %>% merge(df %>% mutate_each(funs(c(NA, diff(log(.)))), B126:H251), by = 'Date', suffixes = c('', '_log'))

# optionally apply rounding function
df %>% mutate_each(funs(round(.,3)), B126_log:H251_log)
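mutate_each() and funs() were deprecated in later dplyr releases. Assuming dplyr 1.0.0 or later, where across() is available, the same idea can be sketched without the merge step. The name result and the "_log" suffix are illustrative choices, not part of the original answer.

```r
library(dplyr)

# Example data from the question (values stored as strings, as in the original)
df <- data.frame(
  Date = c("2004/10/29", "2004/11/30", "2004/12/31", "2005/01/31"),
  B126 = c("103.238", "104.821", "105.141", "107.682"),
  H251 = c("131.149", "138.989", "137.266", "137.080")
)

result <- df %>%
  # convert the string columns to numbers first
  mutate(across(B126:H251, ~ as.numeric(as.character(.x)))) %>%
  # add one new column per series, named <column>_log, keeping the originals;
  # the leading NA keeps each return aligned with its date
  mutate(across(B126:H251, ~ c(NA, diff(log(.x))), .names = "{.col}_log"))

print(result)
```

The .names argument is what keeps the original price columns alongside the new ones; without it, across() would overwrite B126 and H251 in place, as in Answer 2.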